ISSN 1671-3710
CN 11-4766/R
Sponsored by: Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Advances in Psychological Science, 2018, 26(9): 1535-1544 doi: 10.3724/SP.J.1042.2018.01535

Research Proposal

The functional unit of phonological encoding in Chinese spoken production: Study on phonemes

QU Qingqing, LIU Weilin, LI Xingshan

Department of Psychology, University of Chinese Academy of Sciences, Beijing 100049, China

Corresponding author: * QU Qingqing, E-mail: quqq@psych.ac.cn

Received: 2017-11-03   Online: 2018-09-15

Fund supported: National Natural Science Foundation of China (31771212)

Abstract

Studies of speech production have revealed cross-linguistic differences in the processing units involved in phonological encoding. In Indo-European languages, phonemes are widely assumed to play a critical role in spoken production. A phoneme is the smallest unit of sound that distinguishes the meaning of words in a given language; for instance, the word "big" consists of three phonemes, /b/, /i/, and /g/. To date, research on processing units in Chinese spoken production has focused mainly on syllables, and only a few studies have addressed the role of phonemes. In this project, we propose to examine the role of phonemes in Chinese speech production using behavioural and electrophysiological (event-related potential) techniques. Specifically, we will investigate: 1) whether phonemic processing has psychological reality in Chinese speech production, and whether phonemic representations are shaped by exposure to English as a second language, by the acquisition of Pinyin, or by experience with Pinyin-based typing; and 2) how phonemes are processed, namely whether phoneme effects are phoneme-specific and position-specific, how phonemes combine into larger units, and what the time course of phonemic processing is. The findings will deepen our understanding of how Chinese speakers produce words, inform theoretical and computational models of Chinese speech production, provide a basis for comparing Indo-European languages with Chinese, and offer a psychological foundation for methods of teaching Chinese phonology.

Keywords: spoken production; Chinese; functional units in phonological encoding; phoneme; ERPs


Cite this article

QU Qingqing, LIU Weilin, LI Xingshan. The functional unit of phonological encoding in Chinese spoken production: Study on phonemes. Advances in Psychological Science, 2018, 26(9): 1535-1544 doi:10.3724/SP.J.1042.2018.01535

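1 Research background

Speech production, the cognitive process of turning thoughts into spoken utterances, comprises four main stages: 1) conceptual preparation, in which the speaker settles on the concept to be expressed; 2) lexical selection, in which an appropriate word for the prepared concept is retrieved from the mental lexicon; 3) phonological encoding, in which the sound form of the word is constructed; and 4) articulation, in which the planned word is overtly pronounced (Caramazza, 1997; Dell, 1986, 1988; Levelt, Roelofs, & Meyer, 1999; Rapp & Goldrick, 2000). Phonological encoding is a key stage of speech production and has attracted wide attention in recent years, and the nature of the processing units it operates on is one of the most actively debated questions in the field. Theories and models based on Indo-European languages such as English and Dutch (Caramazza, 1997; Dell, 1986, 1988; Levelt, 1989; Levelt et al., 1999) all assume that the phonological encoding system contains abstract phonemic representations, and that the phoneme is an important functional unit in retrieving and encoding phonological information.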

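Languages (and dialects) differ markedly in their phonological properties (Ladefoged, 2001; Ladefoged & Maddieson, 1996), and these differences may well lead to different phonological representations across languages. For example, unlike in Indo-European languages, the phoneme plays only a minor role in Japanese speech production, whereas the mora plays a substantial one (Kureta, Fushimi, & Tatsumi, 2006). In short, existing studies suggest that phonological processing units are language-specific.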

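Research on the phonological representation units of Chinese speech production has made some progress, but most of it has focused on the role of the syllable, and empirical work on the phoneme remains scarce. Yet Chinese scholars began analysing units below the syllable long ago. In the Eastern Han dynasty, the fanqie (反切) notation was invented, which uses two characters to indicate the two parts of a syllable: the initial (shengmu) and the final (yunmu). In the Ming dynasty, the syllable was divided into phoneme-like units: the initial, the medial (yuntou), the nucleus (yunfu), and the coda (yunwei). In the last century, Chinese scholars began to apply Western phonemic theory to Chinese and to analyse Mandarin in terms of phonemes (vowels and consonants). Does phonemic processing, then, have psychological reality in Chinese speech production? If so, how are phonemes processed? These important questions remain to be explored.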

2 Current state of research in China and abroad

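2.1 The phoneme [1]

[1] Note that the phoneme differs from the phone. A phone is the smallest unit of speech sound defined by its natural (physical) properties. Physiologically, one articulatory gesture produces one phone, and sounds produced by different gestures are different phones. For example, in the English word "paper" the first /p/ is aspirated [pʰ] and the second is an unaspirated voiceless [p]; [pʰ] and [p] are different phones belonging to the same phoneme. Phones are usually transcribed in the International Phonetic Alphabet and enclosed in square brackets [ ]. Even native speakers of the language may not notice the phonetic difference between such phones. In this project we are concerned with phonemes, which are abstract, rather than with phones.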

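The phoneme is the smallest unit of sound that distinguishes the phonological forms of words in a particular language (Ye & Xu, 2010). Phonemes are sound categories established on the basis of their meaning-distinguishing function, and the phoneme systems of different languages cannot be mapped onto one another (Huang & Liao, 2011). Two minimal sound units constitute different phonemes when they can distinguish different words. In Mandarin, for instance, 米 (mi, "rice") and 你 (ni, "you") are distinguished by the consonants /m/ and /n/; because these consonants distinguish word forms and word meanings, /m/ and /n/ are two phonemes. In English, /b/ and /p/ are different phonemes because they distinguish "big" from "pig", and the final sounds /n/ and /ŋ/ in "sin" and "sing" are likewise two different phonemes. Phonemes are conventionally written between two slashes. Phonemes [2] comprise vowel phonemes, which are abstracted from vowels (e.g., /a/, /o/, /e/), and consonant phonemes, which are abstracted from consonants (e.g., /b/, /p/, /m/). Languages differ in the number of phonemes they use. Mandarin has 22 consonant phonemes and 10 vowel phonemes (Ye & Xu, 2010; see Table 1). In English the number varies with dialect, up to 24 consonant phonemes and 20 vowel phonemes. Traditional linguistic theory holds that every language has a relatively small set of phonemes, which combine in various ways to represent the thousands of words of the language (Chomsky & Halle, 1968; Trubetzkoy, 1969).

[2] Phonemes can be divided into segmental phonemes and suprasegmental phonemes. Suprasegmental phonemes are contrasts of pitch, intensity, and duration that distinguish word forms; for example, the four tones of Mandarin are pitch patterns that distinguish word forms and are called tonemes. In this project we are concerned only with segmental phonemes, referred to here simply as "phonemes".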

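Table 1   Definition of the phoneme and the phonemes of Mandarin

Concept: Phoneme
Definition: The smallest unit of sound that distinguishes word meanings in a particular language; that is, a sound category established on the basis of its meaning-distinguishing function.
Mandarin phonemes (given in Pinyin letters): Mandarin has 10 vowel phonemes (a, o, e, i, u, ü, ai, ei, ui, ue) and 22 consonant phonemes (b, p, m, f, d, t, n, l, g, k, h, j, q, x, z, c, s, zh, ch, sh, r, ng).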
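The minimal-pair criterion used above to identify phonemes ("big" vs. "pig", mi vs. ni) can be illustrated with a small sketch. It assumes that words are already transcribed as lists of segmental phonemes with tone ignored; the function name and the transcriptions are hypothetical and serve only as illustration.

```python
def is_minimal_pair(word_a, word_b):
    """Return True if two phoneme sequences differ in exactly one position."""
    if len(word_a) != len(word_b):
        return False
    differences = sum(1 for a, b in zip(word_a, word_b) if a != b)
    return differences == 1


# Hypothetical segmental transcriptions (tone ignored).
big = ["b", "i", "g"]
pig = ["p", "i", "g"]
mi = ["m", "i"]
ni = ["n", "i"]

print(is_minimal_pair(big, pig))  # True: /b/ and /p/ contrast
print(is_minimal_pair(mi, ni))    # True: /m/ and /n/ contrast
print(is_minimal_pair(big, mi))   # False: different lengths
```

A pair that passes this check shows that the two differing sounds can distinguish words, which is the property that defines them as separate phonemes.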



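2.2 Psychological reality of the phoneme in speech production in Indo-European languages

As noted above, the phonological processing unit, that is, the functional unit by which phonological forms are stored in and retrieved from long-term memory (Baudouin de Courtenay, 1972), is a core issue in speech production research. Existing experimental evidence and models agree that the phoneme is an important functional unit of phonological encoding in Indo-European languages. The earliest evidence came from analyses of speech errors. Most sound errors involve the insertion, deletion, substitution, or exchange of a single phoneme (e.g., York library → lork yibrary, reading list → leading list; Dell, 1986; Garrett, 1980; Meringer & Mayer, 1895; Shattuck-Hufnagel, 1979), whereas errors involving a whole syllable (e.g., napkin → kinnap) or a single phonological feature (e.g., blue → plue) are quite rare (Shattuck-Hufnagel, 1979, 1983). Speech error analyses thus provide direct evidence for the psychological reality of the phoneme.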

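Further evidence comes from experimental studies. In the implicit priming paradigm, participants first learn a small set of word pairs (prompt word and response word). In the test phase only the prompt is presented, and participants must say the associated response word as quickly and accurately as possible. In the homogeneous condition, all response words within a block share some aspect of word form; in the heterogeneous condition, the response words are unrelated. When the shared portion is at the beginning of the word, responses are faster in homogeneous than in heterogeneous blocks (Meyer, 1990; Roelofs, 1996, 1998; Zhang, 2008). Meyer (1991) used this paradigm to manipulate whether the response words shared their initial phoneme. When they did (e.g., /d/: dans, dop, deugd, doek, dier; the materials were Dutch), response times were significantly shorter than in the heterogeneous condition. The same phoneme facilitation effect has been found in other languages (see Alario, Perre, Castel, & Ziegler, 2007, for French; Damian & Bowers, 2003, for English), supporting the psychological reality of phonemic processing.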
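As a minimal sketch of how the facilitation effect in this paradigm is quantified, the snippet below subtracts the mean response time of homogeneous blocks from that of heterogeneous blocks. The function name and the response times are invented for illustration; real analyses would also exclude errors and outliers and test the difference statistically.

```python
from statistics import mean


def preparation_effect(homogeneous_rts, heterogeneous_rts):
    """Mean heterogeneous RT minus mean homogeneous RT, in ms.

    A positive value indicates faster responses in homogeneous blocks,
    i.e. facilitation from the shared word-form information.
    """
    return mean(heterogeneous_rts) - mean(homogeneous_rts)


# Invented response times (ms) for one participant.
homogeneous = [612, 598, 630, 605]
heterogeneous = [655, 640, 662, 648]

print(preparation_effect(homogeneous, heterogeneous))  # 40.0
```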

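Another line of evidence comes from the coloured picture naming paradigm (Damian & Dumay, 2007, 2009). Participants see a coloured picture and must name the colour followed by the picture name as quickly and accurately as possible. Damian and Dumay (2007, 2009) manipulated the relation between the colour name and the picture name so that the two either shared the initial phoneme or were entirely unrelated. Naming latencies were shorter when the colour and the picture name shared the initial phoneme (e.g., "red rope") than when they were unrelated. Moreover, the priming effect was not confined to the initial phoneme: it was also present when the shared phoneme was a medial vowel or a final consonant. This rules out a "word-onset effect" [3] account and provides evidence for the psychological reality of the phoneme.

[3] The word-onset effect refers to effects that arise from the special status of the beginning of a word. In speech errors, for example, word-initial phonemes are involved in errors more often than phonemes in other positions (Dell, 1984; MacKay, 1972). Similarly, in word recognition the first letter of a word matters more than letters in other positions: transposing the initial letter disrupts recognition significantly more than transposing word-internal or word-final letters (Rayner, White, Johnson, & Liversedge, 2006; White, Johnson, Liversedge, & Rayner, 2008).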

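2.3 Processing mechanisms of the phoneme in speech production in Indo-European languages

Since the 1990s, the processing of phonemes has been studied from several angles, with some progress. In the implicit priming paradigm, when response words merely share similar phonological features (e.g., /b/ and /p/, which are similar in features but are two different phonemes), response times do not differ between homogeneous and heterogeneous conditions. This rules out a "feature similarity" account and shows that the effect in Dutch production is phoneme-specific rather than driven by combinations of phonological features. In addition, a priming effect arises only when the shared phonemes begin at the word onset; shared phonemes in the rime or coda alone produce no effect, and the size of the effect grows with the number of shared phonemes. Meyer (1991) therefore proposed that phonemes are assembled in a left-to-right, incremental fashion, from the onset to the nucleus and finally to the coda.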

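Models of speech production make different assumptions about whether phonemes are position-specific. In Dell's (1986) model, the same phoneme in different positions within a word has different representations; that is, phonemes are position-specific. For example, the /g/ in "green" and the /g/ in "flag" are distinct representations, because the former is a prevocalic and the latter a postvocalic consonant. The WEAVER model proposed by Roelofs (1997), by contrast, assumes that phonemes are inserted serially, from left to right, into a syllable frame and are not position-specific. The position-specific hypothesis predicts phoneme priming only when the repeated phoneme occupies matching positions in the two words, whereas the position-nonspecific hypothesis predicts priming whether or not the positions match. The evidence supports the latter: in a coloured picture naming task, a similar phoneme priming effect was found even when the shared phoneme occupied different positions in the two words (e.g., "green frog", where the shared consonant /g/ is word-initial in one word and word-final in the other; Damian & Dumay, 2009).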

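Early studies of the time course of speech production typically combined non-naturalistic language tasks with ERPs. For example, using a dual-choice go/no-go paradigm with the lateralized readiness potential (LRP) as the electrophysiological index, van Turennout, Hagoort, and Brown (1997) found that semantic processing precedes phonological processing and that each phoneme takes roughly 25 ms to process. Because it is debated whether such non-naturalistic button-press tasks without overt speech tap into natural language production, researchers have turned to combining overt naming tasks with ERPs. Using a picture-word interference paradigm with delayed responses, Jescheniak, Hahne, and Schriefers (2003) found that amplitudes differed significantly between phonologically related and unrelated conditions from 400 to 1000 ms. Dell'Acqua et al. (2010) manipulated the phonological relatedness of picture names and distractor words in a picture-word interference paradigm and found significant amplitude differences between related and unrelated conditions 250-450 ms after stimulus onset. Using a masked priming paradigm, Blackford, Holcomb, Grainger, and Kuperberg (2012) found significant differences between phonologically related and unrelated conditions in the 350-550 ms window. In a meta-analysis of 82 imaging experiments on word production, Indefrey and Levelt (2004) estimated that phonological encoding takes place roughly 275-445 ms after stimulus onset.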

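In sum, research on Indo-European languages, using a range of paradigms, has provided evidence for the psychological reality of phonemic processing and has examined its mechanisms. The results indicate that phonemic processing is phoneme-specific and position-nonspecific, that phonemes are assembled in a left-to-right, incremental fashion, and that phonological encoding occupies an identifiable time window.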

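2.4 Psychological reality of the phoneme in Chinese speech production

Compared with alphabetic languages, Chinese has distinctive phonological properties and orthographic structure. a) Each Chinese character corresponds to one syllable, and Chinese has relatively few syllables (about 400) compared with most alphabetic languages. b) In alphabetic languages, phonemes correspond to letters or letter combinations (/b/-b, /f/-ph); that is, phonemes have orthographic representations. In Chinese, phonemes do not correspond to strokes or stroke combinations, so they have no orthographic representation. c) Traditional Chinese phonologists divided the syllable into an initial and a final on the basis of Chinese syllable structure. d) Native speakers of Chinese can discriminate differences carried by phonemes; for example, they can detect the difference between /mi/ and /ni/ (Huang & Liao, 2011). Such marked cross-linguistic differences in phonology and orthography may lead to cross-linguistic differences in phonological processing units, and the properties listed above may make the processing unit of Chinese the "phonemically specified syllable" (O'Seaghdha, Chen, & Chen, 2010).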

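In recent years, behavioural experiments have examined the phonological processing units of Chinese speech production. In contrast to findings from alphabetic languages, most of the Chinese evidence supports the syllable as the unit. Using implicit priming, Chen and colleagues (Chen, Chen, & Dell, 2002; O'Seaghdha et al., 2010) found that syllable overlap among response words produced a significant facilitation effect, whereas phoneme overlap had little influence on response times. Using masked priming, You, Zhang, and Verdonschot (2012) found priming only for syllable-level overlap and not for sub-syllabic overlap, underscoring the psychological reality of the syllable in Chinese production. To account for the divergent results, O'Seaghdha et al. (2010) proposed that differences in phonological properties and orthographic conventions cause languages to differ in the first phonological unit selected below the word level (the "proximate unit"). In Indo-European alphabetic languages the proximate unit is the phoneme; in Japanese it is the mora; in Chinese it is the syllable. Importantly, O'Seaghdha et al. do not deny a role for phonemes in Chinese: they propose that phonemes are retrieved and processed at the level below the syllable.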

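Indeed, syllable and phoneme processing are not mutually exclusive. Experimental and computational studies have already provided evidence for phonemic processing in Chinese speech production. Our earlier work (Qu, Damian, & Kazanina, 2012, 2013) combined the coloured picture naming task with ERPs to investigate the phonological encoding units of Chinese speech production. Although naming latencies did not differ significantly between the phoneme-repetition and no-repetition conditions, in the 200-300 ms window after stimulus onset the shared-initial-phoneme condition elicited more positive amplitudes than the unrelated condition, reflecting phoneme facilitation. In the subsequent 300-400 ms window, the shared-phoneme condition elicited a significantly more negative waveform than the unrelated condition; this negativity was interpreted as inhibition arising from the high load placed on the internal self-monitoring mechanism when the same phoneme is retrieved repeatedly. This study provided the first electrophysiological evidence for phonemic processing in Chinese speech production. It also has methodological implications. As argued in that paper, phoneme facilitation is cancelled out by the subsequent self-monitoring inhibition, which explains why neither that study nor earlier ones found a behavioural phoneme effect. In other words, conventional behavioural measures have important limitations, and EEG can overcome some of the limitations of relying on response times alone (Qu et al., 2013; Qu, Zhang, & Damian, 2016). Yu, Mo, and Mo (2014), using an implicit priming picture naming paradigm with ERPs, replicated the findings of Qu et al. Roelofs (2015) built a computational model of speech production covering alphabetic languages, Japanese, and Chinese. In this model, although the proximate unit differs across languages, all languages, including Chinese, involve phoneme encoding. Roelofs states explicitly that "initial segments were actually prepared in the segment-only condition of the WEAVER++ simulations for Mandarin Chinese" (p. 13). These findings suggest that, in addition to the syllable, the phoneme may also be a phonological processing unit in Chinese speech production.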
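The time-window logic behind these ERP findings can be sketched as follows: average the amplitude of each condition within a window and take the related-minus-unrelated difference. This is a simplified illustration, not the analysis pipeline of the cited studies; the function names, the single-channel waveforms, and all amplitude values are invented.

```python
def window_mean(amplitudes, times_ms, start_ms, end_ms):
    """Mean amplitude of the samples with start_ms <= time < end_ms."""
    selected = [a for a, t in zip(amplitudes, times_ms) if start_ms <= t < end_ms]
    if not selected:
        raise ValueError("no samples fall inside the requested window")
    return sum(selected) / len(selected)


def window_effect(related, unrelated, times_ms, start_ms, end_ms):
    """Related-minus-unrelated mean amplitude difference in one window."""
    return (window_mean(related, times_ms, start_ms, end_ms)
            - window_mean(unrelated, times_ms, start_ms, end_ms))


# Invented single-channel averages (microvolts), one sample every 50 ms.
times = list(range(0, 500, 50))
related = [0.0, 0.5, 1.0, 1.5, 3.0, 3.0, 1.0, 1.0, 2.0, 2.0]
unrelated = [0.0, 0.5, 1.0, 1.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0]

print(window_effect(related, unrelated, times, 200, 300))  # 1.0 (more positive)
print(window_effect(related, unrelated, times, 300, 400))  # -1.0 (more negative)
```

In this toy data the two window effects have opposite signs, which mirrors the pattern described above: an early positivity followed by a later negativity that could cancel out in a single behavioural measure.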

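3 Research questions

In summary, research on phonemic processing in speech production is far less developed for Chinese than for Indo-European alphabetic languages. The core questions that remain to be examined are as follows.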

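First, the psychological reality of phonemic processing in Chinese speech production and the factors that influence it. Evidence for the psychological reality of phonemic processing in Chinese is still scarce and disputed. We will combine several experimental paradigms, exploiting the strengths of each, to examine thoroughly and systematically the psychological reality of phonemes in different positions within a word. In addition, the factors underlying phonemic representation have not been examined. Based on the properties of Chinese phonemes, we identify potential factors and will focus on whether phonemic representation is affected by second-language proficiency, by the acquisition of Hanyu Pinyin, and by experience in using Pinyin. Clarifying whether these factors matter will help to determine whether the psychological reality of the phoneme is universal.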

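Second, the mechanisms of phonemic processing in Chinese speech production, a question that remains largely unexplored. Taking into account the phonetic and phonological properties of Chinese, and using overt naming tasks with EEG, this project will examine phonemic processing in Chinese along four dimensions: phoneme specificity, position specificity, the way phonemes are combined, and time course.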

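4 Research proposal

(1) The psychological reality of the phoneme in Chinese is still disputed, and the factors behind it remain to be examined.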

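The studies by Qu et al. (2012) and Yu et al. (2014) demonstrated priming only for the initial consonant in Chinese speech production and did not examine phonemes in other positions (such as the medial vowel or the final consonant). A systematic investigation across paradigms and positions is therefore needed. We will use the implicit priming paradigm, manipulating whether response words share their initial phoneme or are unrelated, and the masked priming paradigm, which allows phoneme relatedness to be manipulated flexibly at different positions (initial, medial, and final phoneme). We will compare response times, ERP amplitudes, and scalp topographies between phoneme-related and unrelated conditions to provide further evidence on the psychological reality of phonemic processing in Chinese.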

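Language processing is generally shaped by the particular languages a person uses, so the role of the phoneme in Chinese may be influenced by a second language, especially when that second language is an Indo-European language in which the phoneme is the basic phonological unit. In Qu et al. (2012), the participants were native Chinese speakers studying in the United Kingdom, and the phonemic representations they displayed may well have been influenced by their second language, English. Yu et al. (2014) tested native Chinese-speaking university students with low English proficiency, but because English is a compulsory subject for university students, an influence of English cannot be fully excluded. Some studies suggest that experience with a second language (English) strengthens phonemic awareness. Verdonschot, Nakayama, Zhang, Tamaoka, and Schiller (2013) used a masked priming naming task to examine the production units of proficient Chinese-English bilinguals and found priming in response times not only at the syllable level but also at the level of sub-syllabic phoneme combinations. It is therefore necessary to examine systematically how a second language affects phonemic representation. In this project, we will manipulate whether the colour name and the picture name share their initial phoneme or are unrelated, record ERPs, and vary second-language proficiency, comparing the phoneme effect in Chinese spoken production between Chinese-English bilinguals of high and low proficiency.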

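In addition, Hanyu Pinyin is widely used for annotating the pronunciation of characters, for teaching pronunciation, and for computer text input. The Pinyin system uses Latin letters to stand for phonemes (vowels and consonants) and follows the standard Latin alphabetical order, and it can represent the pronunciation of every character in Mandarin. The letter-to-phoneme correspondence in Pinyin may encourage speakers to encode speech in phoneme-sized units. Previous research has shown that Pinyin has a marked effect on the acquisition of phonemic awareness: participants who had never learned Pinyin found phoneme addition and deletion tasks much harder than participants who had (Read, Zhang, Nie, & Ding, 1986). We therefore ask: does the acquisition of Pinyin affect phonemic processing in Chinese speech production? Using the coloured picture naming paradigm, we will measure the phoneme effect in preschool children and adults who have received no Pinyin training and compare it with that of people who have, in order to assess the effect of Pinyin acquisition on phonemic representation. Experience in using Pinyin may also matter. Pinyin use varies greatly across individuals, largely as a function of typing input method: users of Pinyin input methods use Pinyin far more often than users of the Wubi (five-stroke) input method. We will compare the phoneme effect in Wubi and Pinyin input users to assess the possible influence of Pinyin experience on phonemic processing. The potential influence of all these factors is still unclear, and clarifying it will help to determine whether the psychological reality of the phoneme is universal.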

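(2) The mechanisms of phonemic processing in Chinese remain largely unexplored.

Every language has its own phonetic properties. Based on the properties of Chinese speech sounds, and of Chinese phonemes in particular, we pose four core questions about phonemic processing in Chinese. First, is processing phoneme-specific in Chinese speech production? Chinese and English consonants differ mainly in that most English consonants contrast in voicing (e.g., /p/ vs. /b/, /t/ vs. /d/, /k/ vs. /g/), whereas most Chinese consonants contrast in aspiration, and nearly all are voiceless, with only a few voiced consonants (Ye & Xu, 2010). For example, the Mandarin phonemes /b/ and /p/ differ in the distinctive feature of aspiration. Chinese and English vowels differ mainly in whether length is contrastive: in English, vowel length can distinguish phonemes, so that long /i:/ and short /i/ are different phonemes and can distinguish word meanings, whereas in Chinese vowel length is not contrastive.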

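As described above, processing in Dutch is phoneme-specific: priming arises only when the phonemes are identical, and different phonemes with similar features (e.g., /b/ and /p/) produce no significant effect. Phoneme specificity has not yet been examined in Chinese. In this project we will include three conditions, in which the critical phonemes are identical, articulatorily or acoustically similar, or entirely different, and compare the phoneme priming effects across conditions.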

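Second, are phonemes position-specific? Does the same phoneme in different positions (e.g., 奶 /nai3/ and 谈 /tan2/, with /n/ as onset and as coda, respectively) share one representation, or are phonemes in different positions represented separately? Compared with some Indo-European languages, the positional distribution of phonemes in Chinese is more tightly constrained. Russian and English, for example, allow clusters of up to three or four adjacent consonants, so that syllables can be classified as single-consonant or consonant-cluster syllables, whereas Chinese does not allow consonants to combine directly within a syllable. Moreover, English places few restrictions on syllable-final consonants: all but a few consonants, 22 in total, can occur as codas. The consonant /g/, for example, can occur syllable-initially ("green") and syllable-finally ("frog"). In Mandarin, by contrast, syllable-final consonants are severely restricted. The vast majority of consonants can occur only at the beginning of a syllable; only the two nasals /n/ and /ng/ can occur at the end, and /n/ is the only consonant that can serve as both onset and coda.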

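How flexible a language's positional rules are may affect position specificity: the more tightly constrained the positions of phonemes, the stronger position specificity may be. This hypothesis predicts that position-specific effects are more likely in Chinese than in Indo-European languages. Results from English (Damian & Dumay, 2009) supported the position-nonspecific hypothesis. In this project, we will test position specificity in Chinese by manipulating whether the phoneme shared by the picture names occupies matching positions (e.g., onset /n/ in both) or mismatching positions (onset /n/ in one and coda /n/ in the other). If Chinese phonemes are position-specific, phoneme priming should appear only in the position-matched condition and not in the mismatched condition.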
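To make the position-matched and position-mismatched conditions concrete, the sketch below classifies a pair of syllables by where a shared phoneme occurs. The three-slot syllable representation, the function name, and the example items are assumptions made for illustration only; they are not the materials of the planned experiments.

```python
from typing import NamedTuple, Optional


class Syllable(NamedTuple):
    """Simplified syllable: optional onset, nucleus, optional coda (tone ignored)."""
    onset: Optional[str]
    nucleus: str
    coda: Optional[str]


def overlap_type(a, b, phoneme):
    """Classify how `phoneme` is shared by syllables a and b.

    Returns "matched" if it fills the same slot in both syllables,
    "mismatched" if it occurs in both but only in different slots,
    and "none" if at least one syllable lacks it.
    """
    slots_a = {slot for slot in Syllable._fields if getattr(a, slot) == phoneme}
    slots_b = {slot for slot in Syllable._fields if getattr(b, slot) == phoneme}
    if not slots_a or not slots_b:
        return "none"
    if slots_a & slots_b:
        return "matched"
    return "mismatched"


# Hypothetical items written in Pinyin-like segments.
nai = Syllable(onset="n", nucleus="ai", coda=None)
niu = Syllable(onset="n", nucleus="iu", coda=None)
tan = Syllable(onset="t", nucleus="a", coda="n")
ma = Syllable(onset="m", nucleus="a", coda=None)

print(overlap_type(nai, niu, "n"))  # matched
print(overlap_type(nai, tan, "n"))  # mismatched
print(overlap_type(nai, ma, "n"))   # none
```

Under the position-specific hypothesis, only pairs classified as "matched" should show priming; under the position-nonspecific hypothesis, "matched" and "mismatched" pairs should both show it.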

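Third, are phonemes combined linearly or nonlinearly? The rules by which phonemes combine with one another make up a language's phoneme system, and the rules of Chinese differ from those of Indo-European languages. Analyses of Indo-European languages mostly start from the properties of alphabetic writing and classify phonemes as vowels and consonants. For Mandarin, there are two ways of analysing the phoneme system: one in terms of vowels and consonants, and one in terms of initials and finals, which groups phonemes into initial phonemes and final phonemes (Ye & Xu, 2010).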

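If combination is linear, then in a CVC syllable (consonant + vowel + consonant) any two adjacent phonemes have the same psychological status; that is, CV (initial + nucleus) and VC (nucleus + coda) are equally real psychologically. This linear hypothesis excludes a layer of initial and final representations above the phoneme layer; phonemes connect directly to the syllable layer. If, on the other hand, Chinese phonemes are grouped into initials and finals, VC (nucleus + coda) should affect performance more than CV (initial + nucleus), because VC makes up the final. This project will compare the priming effects of CV and VC overlap to determine how phonemes are combined.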

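Fourth, what is the time course of processing for the individual phonemes? English syllable structure is relatively complex and subject to resyllabification. Chinese syllable structure is simpler: there are only some 400 commonly used syllables when tone is disregarded and about 1,300 tonal syllables, each character corresponds to one syllable, and resyllabification is essentially absent. Given the prominence of the syllable in Chinese, O'Seaghdha et al. (2010) contrasted phonological encoding in Chinese and English in their model and stated that the phonological encoding of Chinese words begins with syllable processing, followed by phoneme encoding, whereas in English it begins with phoneme processing. If the ordering of syllable and phoneme processing proposed by O'Seaghdha et al. is correct, that is, if syllable processing precedes phoneme processing in Chinese, the syllable effect should emerge in an earlier time window than the phoneme effect. If phonemes are processed serially, the initial-phoneme effect should emerge earlier than the medial-phoneme effect, which should in turn emerge earlier than the final-phoneme effect. If phonemes are processed in parallel, the time windows should not differ across conditions.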
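The serial and parallel predictions stated above can be written down as a simple decision rule over the onset latencies of the three phoneme effects. This is only a schematic sketch: the function name, the 10 ms tolerance, and the latencies are hypothetical, and an actual study would compare onset latencies with appropriate statistical tests rather than a fixed threshold.

```python
def classify_time_course(onsets_ms, tolerance_ms=10):
    """Compare effect onsets for the initial, medial, and final phoneme.

    onsets_ms maps "initial", "medial", and "final" to effect onsets in ms.
    Returns "parallel" if all onsets lie within tolerance_ms of each other,
    "serial" if they increase from initial to medial to final by more than
    tolerance_ms at each step, and "unclear" otherwise.
    """
    initial = onsets_ms["initial"]
    medial = onsets_ms["medial"]
    final = onsets_ms["final"]
    if max(initial, medial, final) - min(initial, medial, final) <= tolerance_ms:
        return "parallel"
    if medial - initial > tolerance_ms and final - medial > tolerance_ms:
        return "serial"
    return "unclear"


# Hypothetical effect onsets (ms after picture onset).
print(classify_time_course({"initial": 200, "medial": 240, "final": 280}))  # serial
print(classify_time_course({"initial": 200, "medial": 205, "final": 198}))  # parallel
print(classify_time_course({"initial": 200, "medial": 260, "final": 255}))  # unclear
```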

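Research on the time course of phonological encoding in Chinese has concentrated on the syllable, and very few studies have addressed the phoneme. Zhang and Zhu (2011) combined a metalinguistic task with EEG to examine the relative time course of the individual phonemes and of phonemes and suprasegmentals. In a dual-choice go/no-go paradigm, participants judged whether the name of a presented picture contained a given phoneme or suprasegmental feature and responded by pressing a button with a designated hand. With the N200 as the neural index, the N200 in the initial-consonant task preceded that in the medial-vowel task by 20-80 ms, whereas the N200 did not differ in timing between the medial-vowel and tone tasks. The authors concluded that the phonemes of Chinese are processed incrementally and serially, whereas phonemes and suprasegmentals are processed in parallel. This study offers important insights into the relative time course of segmental processing, but the paradigm has limitations. First, it is a button-press task based on metalinguistic knowledge that requires no spoken response, so it is debatable whether it taps natural language production (Jansma, Rodriguez-Fornells, Möller, & Münte, 2004). Second, some studies suggest that the N200 reflects neural activity related to response inhibition (Jodo & Kayama, 1992; Sasaki & Gemba, 1993) rather than phonological encoding directly, so it can provide only the relative, not the absolute, time course of phonemic processing. Qu et al. (2012) combined overt coloured picture naming with EEG and found that the phoneme effect arose in the 200-300 ms window, a first indication of the absolute time course of phonemic processing. However, that study manipulated only the initial segment and not phonemes in other positions, so the time course of the individual segments, and of segments relative to syllables, remains unknown. In this project we will use ERPs to examine the time course of syllable and phoneme effects in Chinese speech production, and of the processing of the initial phoneme, the medial vowel, and the final consonant.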

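5 Theoretical framework

The two most influential theories of speech production (Dell, 1986; Levelt et al., 1999) both assume that the phonological encoding stage handles phonological information such as syllable structure, prosodic properties, and phonemes. Both hold that the phoneme is an important unit of phonological encoding and that encoding within a syllable is incremental, proceeding from the onset to the nucleus and finally to the coda. These models were built on Indo-European languages such as English and Dutch. Chinese, as a non-alphabetic language, has distinctive properties of its own, and the question of its phonological processing units must be studied with those properties in mind. Grounded in Chinese language cognition and in the distinctive phonological properties of Chinese, this project will use several behavioural paradigms together with ERPs to examine systematically the psychological reality of phonemic processing in Chinese speech production. Moreover, existing models focus on the production process itself and pay little attention to speakers' own language experience and its potential effect on processing mechanisms. This project takes the speaker as its starting point: it will examine speakers' experience of acquiring and using language (Pinyin and a second language) and how that experience affects phonemic processing.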

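To summarize, phonological processing units are language-specific. Languages (and dialects) differ markedly in their phonological properties (Ladefoged, 2001; Ladefoged & Maddieson, 1996), which leads to different phonological representations across languages. Implicit priming studies with Dutch or English materials have shown that phonological encoding operates on phonemes, which are assembled in a left-to-right, incremental fashion (Meyer, 1990; Roelofs, 1996, 1998). Unlike in Indo-European languages, the phoneme plays only a minor role in Japanese speech production, whereas the mora plays a substantial one (Kureta et al., 2006). In Chinese, the syllable has been found to be the main processing unit in speech production (Chen et al., 2002; O'Seaghdha et al., 2010; You et al., 2012). Preliminary results further suggest that the phoneme, in addition to the syllable, is a processing unit in Chinese speech production (Qu et al., 2012; Yu et al., 2014). An initial study of the time course of segmental processing in Chinese found that phonemes may be processed incrementally and serially, whereas phonemes and suprasegmentals are processed in parallel (Zhang & Zhu, 2011).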

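On the basis of these findings and the phonological properties of Chinese, we propose a preliminary framework for phonological encoding in Chinese speech production. The first phonological unit selected below the word level (the proximate unit) is the phonemically specified syllable; that is, phonological encoding begins with the retrieval of a syllable. In this model, syllables are stored in the mental lexicon rather than computed online. Once the stored syllable has been retrieved, it activates the layer of initial and final representations, and finally the phonemes and the tone. Phonemes are retrieved incrementally from left to right, and tone is processed at the same time. The whole process is accompanied by self-monitoring: speakers monitor production in real time and promptly correct possible errors. This project will test and refine the main assumptions of this framework empirically, and doing so will in turn inform further empirical work.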

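References

Huang, B. R., & Liao, X. D. [黄伯荣, 廖序东]. (2011). Modern Chinese (Vol. 1) [现代汉语(上册)]. Beijing: Higher Education Press.

Ye, F. S., & Xu, T. Q. [叶蜚声, 徐通锵]. (2010). Essentials of linguistics (4th ed.) [语言学纲要]. Beijing: Peking University Press.

Zhang, Q. F. [张清芳]. (2008). Phonological encoding in the production of Chinese monosyllabic and disyllabic words: An implicit priming study [in Chinese]. Acta Psychologica Sinica, 40(3), 253-262.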

Alario, F. X., Perre, L., Castel, C., & Ziegler, J. C. (2007). The role of orthography in speech production revisited. Cognition, 102, 464-475. doi:10.1016/j.cognition.2006.02.002

Baudouin de Courtenay, J. N. (1972). A Baudouin de Courtenay anthology: The beginnings of structural linguistics. Bloomington: Indiana University Press.

Blackford, T., Holcomb, P. J., Grainger, J., & Kuperberg, G. R. (2012). A funny thing happened on the way to articulation: N400 attenuation despite behavioral interference in picture naming. Cognition, 123(1), 84-99. doi:10.1016/j.cognition.2011.12.007

Caramazza, A. (1997). How many levels of processing are there in lexical access? Cognitive Neuropsychology, 14, 177-208. doi:10.1080/026432997381664

Chen, J. Y., Chen, T. M., & Dell, G. S. (2002). Word-form encoding in Mandarin Chinese as assessed by the implicit priming task. Journal of Memory and Language, 46, 751-781. doi:10.1006/jmla.2001.2825

Chomsky, N., & Halle, M. (1968). The sound pattern of English. New York: Harper & Row.

Damian, M. F., & Bowers, J. S. (2003). Effects of orthography on speech production in a form-preparation paradigm. Journal of Memory and Language, 49, 119-132. doi:10.1016/S0749-596X(03)00008-1

Damian, M. F., & Dumay, N. (2007). Time pressure and phonological advance planning in spoken production. Journal of Memory and Language, 57, 195-209. doi:10.1016/j.jml.2006.11.001

Damian, M. F., & Dumay, N. (2009). Exploring phonological encoding through repeated segments. Language and Cognitive Processes, 24, 685-712. doi:10.1080/01690960802351260

Dell, G. S. (1984). Representation of serial order in speech: Evidence from the repeated phoneme effect in speech errors. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10(2), 222-233. doi:10.1037/0278-7393.10.2.222

Dell, G. S. (1986). A spreading-activation theory of retrieval in sentence production. Psychological Review, 93, 283-321. doi:10.1037/0033-295X.93.3.283 PMID:3749399

Dell, G. S. (1988). The retrieval of phonological forms in production: Tests of predictions from a connectionist model. Journal of Memory and Language, 27, 124-142. doi:10.1016/0749-596X(88)90070-8

Dell'Acqua, R., Sessa, P., Peressotti, F., Mulatti, C., Navarrete, E., & Grainger, J. (2010). ERP evidence for ultra-fast semantic processing in the picture-word interference paradigm. Frontiers in Psychology, 1, 177. doi:10.3389/fpsyg.2010.00177 PMID:3153787

Garrett, M. F. (1980). The limits of accommodation: Arguments for independent processing levels in sentence production. In V. A. Fromkin (Ed.), Errors in linguistic performance: Slips of the tongue, ear, pen, and hand (pp. 263-271). New York: Academic Press.

Indefrey, P., & Levelt, W. J. M. (2004). The spatial and temporal signatures of word production components. Cognition, 92, 101-144. doi:10.1016/j.cognition.2002.06.001 PMID:15037128

Jansma, B. M., Rodriguez-Fornells, A., Möller, J., & Münte, T. F. (2004). Electrophysiological studies of speech production. In T. Pechmann & C. Habel (Eds.), Multidisciplinary approaches to language production (Trends in linguistics) (pp. 361-395). Berlin, Germany: Mouton de Gruyter.

Jescheniak, J. D., Hahne, A., & Schriefers, H. (2003). Information flow in the mental lexicon during speech planning: Evidence from event-related brain potentials. Cognitive Brain Research, 15(3), 261-276. doi:10.1016/S0926-6410(02)00198-2

Jodo, E., & Kayama, Y. (1992). Relation of a negative ERP component to response inhibition in a go/no-go task. Electroencephalography and Clinical Neurophysiology, 82, 477-482. doi:10.1016/0013-4694(92)90054-L PMID:1375556

Kureta Y., Fushimi T., & Tatsumi I. F. ( 2006).

The functional unit in phonological encoding: Evidence for moraic representation in native Japanese speakers

Journal of Experimental Psychology: Learning, Memory, and Cognition, 32( 5), 1102-1119.

DOI:10.1037/0278-7393.32.5.1102      URL     [本文引用: 2]

Ladefoged, P. ( 2001). Vowels and consonants: An introduction to the sounds of languages. Oxford: Blackwell.

[本文引用: 2]

Ladefoged, P., & Maddieson, I. ( 1996). The sounds of the world’s languages. Oxford: Blackwell Publishers.

[本文引用: 2]

Levelt W. J. M. ( 1989) . Speaking: From intention to articulation. Cambridge, MA: MIT Press.

[本文引用: 1]

Levelt W. J. M., Roelofs A., & Meyer A. S. ( 1999).

A theory of lexical access in speech production

The Behavioral and Brain Sciences, 22, 1-38.

DOI:10.1017/S0140525X99001776      URL     PMID:11301520      [本文引用: 3]

Abstract Preparing words in speech production is normally a fast and accurate process. We generate them two or three per second in fluent conversation; and overtly naming a clear picture of an object can easily be initiated within 600 msec after picture onset. The underlying process, however, is exceedingly complex. The theory reviewed in this target article analyzes this process as staged and feed-forward. After a first stage of conceptual preparation, word generation proceeds through lexical selection, morphological and phonological encoding, phonetic encoding, and articulation itself. In addition, the speaker exerts some degree of output control, by monitoring of self-produced internal and overt speech. The core of the theory, ranging from lexical selection to the initiation of phonetic encoding, is captured in a computational model, called WEAVER++. Both the theory and the computational model have been developed in interaction with reaction time experiments, particularly in picture naming or related word production paradigms, with the aim of accounting for the real-time processing in normal word production. A comprehensive review of theory, model, and experiments is presented. The model can handle some of the main observations in the domain of speech errors (the major empirical domain for most other theories of lexical access), and the theory opens new ways of approaching the cerebral organization of speech production by way of high-temporal-resolution imaging.

MacKay, D. G. (1972). The structure of words and syllables: Evidence from errors in speech. Cognitive Psychology, 3, 210-227. DOI:10.1016/0010-0285(72)90004-7.

This study examines syllabic and morphological determinants of synonymic intrusions such as BEHORTMENT, an inadvertent combination of BEHAVIOR and DEPORTMENT. Statistical analyses of 133 synonymic intrusions in German suggested that syllables are composed of at least three subunits: segments (consonants and vowels), consonant clusters, and a subunit consisting of vowel and final consonant(s). Similar analyses of 46 synonymic intrusions in English suggested that mechanisms underlying this class of error may be universal or common to all speakers. A hierarchic model of the serial order of speech was advanced to explain the structure of words and syllables suggested by these findings. Independent support for the model was noted in the rules governing abbreviations, Pig Latin, poetic rhyme, and other types of errors in speech.

Meringer, R., & Mayer, C. (1895). Versprechen und Verlesen: Eine psychologisch-linguistische Studie. Stuttgart, Germany: Göschen’sche Verlagshandlung.

Meyer, A. S. (1990). The time course of phonological encoding in language production: The encoding of successive syllables of a word. Journal of Memory and Language, 29, 524-545. DOI:10.1016/0749-596X(90)90050-A.

Meyer, A. S. (1991). The time course of phonological encoding in language production: Phonological encoding inside a syllable. Journal of Memory and Language, 30, 69-89. DOI:10.1016/0749-596X(91)90011-8.

O'Seaghdha, P. G., Chen, J. Y., & Chen, T. M. (2010). Proximate units in word production: Phonological encoding begins with syllables in Mandarin Chinese but with segments in English. Cognition, 115, 282-302. DOI:10.1016/j.cognition.2010.01.001.

Qu, Q. Q., Damian, M. F., & Kazanina, N. (2012). Sound-sized segments are significant for Mandarin speakers. Proceedings of the National Academy of Sciences of the United States of America, 109, 14265-14270. DOI:10.1073/pnas.1200632109. PMID:22891321.

Do speakers of all languages use segmental speech sounds when they produce words? Existing models of language production generally assume a mental representation of individual segmental units, or phonemes, but the bulk of evidence comes from speakers of European languages in which the orthographic system codes explicitly for speech sounds. By contrast in languages with nonalphabetical scripts, such as Mandarin Chinese, individual speech sounds are not orthographically represented, raising the possibility that speakers of these languages do not use phonemes as fundamental processing units. We used event-related potentials (ERPs) combined with behavioral measurement to investigate the role of phonemes in Mandarin production. Mandarin native speakers named colored line drawings of objects using color adjective-noun phrases; color and object name either shared the initial phoneme or were phonologically unrelated. Whereas naming latencies were unaffected by phoneme repetition, ERP responses were modulated from 200 ms after picture onset. Our ERP findings thus provide strong support for the claim that phonemic segments constitute fundamental units of phonological encoding even for speakers of languages that do not encode such units orthographically.

Qu, Q. Q., Damian, M. F., & Kazanina, N. (2013). Reply to O’Seaghdha et al.: Primary phonological planning units in Chinese are phonemically specified. Proceedings of the National Academy of Sciences of the United States of America, 110(1), E4. DOI:10.1073/pnas.1217601110.

Qu, Q. Q., Zhang, Q. F., & Damian, M. F. (2016). Tracking the time course of lexical access in orthographic production: An event-related potential study of word frequency effects in written picture naming. Brain and Language, 159, 118-126. DOI:10.1016/j.bandl.2016.06.008.

Rapp, B., & Goldrick, M. (2000). Discreteness and interactivity in spoken word production. Psychological Review, 107, 460-499. DOI:10.1037//0033-295X.107.3.460. PMID:10941277.

Five theories of spoken word production that differ along the discreteness-interactivity dimension are evaluated. Specifically examined is the role that cascading activation, feedback, seriality, and interaction domains play in accounting for a set of fundamental observations derived from patterns of speech errors produced by normal and brain-damaged individuals. After reviewing the evidence from normal speech errors, case studies of 3 brain-damaged individuals with acquired naming deficits are presented. The patterns these individuals exhibit provide important constraints on theories of spoken naming. With the help of computer simulations of the 5 theories, the authors evaluate the extent to which the error patterns predicted by each theory conform with the empirical facts. The results support a theory of spoken word production that, although interactive, places important restrictions on the extent and locus of interactivity.

Rayner, K., White, S. J., Johnson, R. L., & Liversedge, S. P. (2006). Raeding wrods with jubmled lettres: There is a cost. Psychological Science, 17(3), 192-193. DOI:10.1111/j.1467-9280.2006.01684.x.


Read, C., Zhang, Y., Nie, H. Y., & Ding, B. Q. (1986). The ability to manipulate speech sounds depends on knowing alphabetic spelling. Cognition, 24, 31-44. DOI:10.1016/0010-0277(86)90003-X. PMID:3791920.

Chinese adults literate only in Chinese characters could not add or delete individual consonants in spoken Chinese words. A comparable group of adults, literate in alphabetic spelling as well as characters, could perform the same tasks readily and accurately. The two groups were similar in education and experience but differed in age and consequently in whether they had learned an alphabetic writing system in school. Even adults who had once learned alphabetic writing but were no longer able to use it were able to manipulate speech sounds in this way. This “segmentation” skill, which has been shown to contribute to skilled reading and writing, does not develop with cognitive maturation, non-alphabetic literacy, or exposure to a language rich in rhymes

Roelofs, A. (1996). Serial order in planning the production of successive morphemes of a word. Journal of Memory and Language, 35, 854-876. DOI:10.1006/jmla.1996.0044.

Five implicit priming experiments examined whether the speech production system can plan noninitial morphemes of a word in advance of initial ones. On each trial, subjects had to produce one word out of a set of three words as quickly as possible. In a homogeneous condition, the responses shared part of their form, whereas in a heterogeneous condition they did not. The first experiment shows that the task is sensitive to morphological planning. In producing disyllabic simple and compound nouns, a larger facilitatory effect was obtained when a shared initial syllable constituted a morpheme than when it did not. The next three experiments suggest that successive morphemes are planned in serial order. In producing nominal compounds, no facilitation was obtained for noninitial morphemes. In producing prefixed verbs, facilitation was obtained for the prefix but not for the noninitial base. Sharing morphemes often implies semantic overlap. The fifth experiment shows that semantic similarity per se yields inhibition rather than facilitation. Computer simulations show that the WEAVER model of word-form encoding (Roelofs, 1992b, 1994, submitted-a) accounts for the findings.

Roelofs, A. (1997). The WEAVER model of word-form encoding in speech production. Cognition, 64, 249-284. DOI:10.1016/S0010-0277(97)00027-9. PMID:9426503.

Lexical access in speaking consists of two major steps: lemma retrieval and word-form encoding. In Roelofs (Roelofs, A. 1992a. Cognition 42, 107-142; Roelofs, A. 1993. Cognition 47, 59-87.), I described a model of lemma retrieval. The present paper extends this work by presenting a comprehensive model of the second access step, word-form encoding. The model is called WEAVER (Word-form Encoding by Activation and VERification). Unlike other models of word-form generation, WEAVER is able to provide accounts of response time data, particularly from the picture-word interference paradigm and the implicit priming paradigm. Its key features are (1) retrieval by spreading activation, (2) verification of activated information by a production rule, (3) a rightward incremental construction of phonological representations using a principle of active syllabification, syllables are constructed on the fly rather than stored with lexical items, (4) active competitive selection of syllabic motor programs using a mathematical formalism that generates response times and (5) the association of phonological speech errors with the selection of syllabic motor programs due to the failure of verification.

Roelofs, A. (1998). Rightward incrementality in encoding simple phrasal forms in speech production: Verb-particle combinations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 24, 904-921. DOI:10.1037/0278-7393.24.4.904.

Roelofs, A. (2015). Modeling of phonological encoding in spoken word production: From Germanic languages to Mandarin Chinese and Japanese. Japanese Psychological Research, 57, 22-37.

Sasaki, K., & Gemba, H. (1993). Prefrontal cortex in the organization and control of voluntary movement. In T. Ono, L. R. Squire, M. E. Raichle, D. I. Perrett, & M. Fukuda (Eds.), Brain mechanisms of perception and memory: From neuron to behavior (pp. 473-496). New York: Oxford University Press.

Shattuck-Hufnagel, S. (1979). Speech errors as evidence for a serial-ordering mechanism in sentence production. In W. E. Cooper & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 295-342). Hillsdale, NJ: Erlbaum.

Shattuck-Hufnagel, S. (1983). Sublexical units and suprasegmental structure in speech production planning. In P. F. MacNeilage (Ed.), The production of speech (pp. 109-136). New York: Springer. DOI:10.1007/978-1-4613-8202-7_6.

A number of elements have been suggested as units of sublexical processing during planning for speech production, some derived from grammatical theory and some from observed variations and constancies in the acoustic and articulatory patterns of speech. Candidates range from muscle-group control mechanisms to distinctive features, individual phonemic segments, diphones, demisyllables, syllable onsets and rhymes, and even syllables themselves. Proposals vary widely, partly because different levels of processing are being modeled, but also because the production planning process is highly complex, and our understanding of its many aspects is still quite rudimentary. To cite just a few areas where our models are particularly primitive, little is known about the planning mechanisms that might impose serial order on abstractly represented units, those that integrate adjacent elements with each other, those that coordinate all of the factors influencing segment duration, and those that compute motor commands; even less is known about the relationships among such possible processing components.

Trubetzkoy, N. (1969). Principles of phonology. Berkeley: University of California Press.

van Turennout, M., Hagoort, P., & Brown, C. M. (1997). Electrophysiological evidence on the time course of semantic and phonological processes in speech production. Journal of Experimental Psychology: Learning, Memory, and Cognition, 23(4), 787-806. DOI:10.1037/0278-7393.23.4.787.

Verdonschot, R. G., Nakayama, M., Zhang, Q. F., Tamaoka, K., & Schiller, N. O. (2013). The proximate phonological unit of Chinese-English bilinguals: Proficiency matters. PLoS One, 8(4), e61454. DOI:10.1371/journal.pone.0061454. PMID:3640013.


White, S. J., Johnson, R. L., Liversedge, S. P., & Rayner, K. (2008). Eye movements when reading transposed text: The importance of word-beginning letters. Journal of Experimental Psychology: Human Perception and Performance, 34(5), 1261-1276. DOI:10.1037/0096-1523.34.5.1261. PMID:18823209.

Participants' eye movements were recorded as they read sentences with words containing transposed adjacent letters. Transpositions were either external (e.g., problme, rpoblem) or internal (e.g., porblem, probelm) and at either the beginning (e.g., rpoblem, porblem) or end (e.g., problme, probelm) of words. The results showed disruption for words with transposed letters compared to the normal baseline condition, and the greatest disruption was observed for word-initial transpositions. In Experiment 1, transpositions within low frequency words led to longer reading times than when letters were transposed within high frequency words. Experiment 2 demonstrated that the position of word-initial letters is most critical even when parafoveal preview of words to the right of fixation is unavailable. The findings have important implications for the roles of different letter positions in word recognition and the effects of parafoveal preview on word recognition processes.

You, W. P., Zhang, Q. F., & Verdonschot, R. G. (2012). Masked syllable priming effects in word and picture naming in Chinese. PLoS One, 7(10), e46595. DOI:10.1371/journal.pone.0046595. PMID:23056360.

Four experiments investigated the role of the syllable in Chinese spoken word production. Chen, Chen and Ferrand (2003) reported a syllable priming effect when primes and targets shared the first syllable using a masked priming paradigm in Chinese. Our Experiment 1 was a direct replication of Chen et al.’s (2003) Experiment 3 employing CV (e.g., /ba2.ying2/, strike camp) and CVG (e.g., /bai2.shou3/, white haired) syllable types. Experiment 2 tested the syllable priming effect using different syllable types: e.g., CV (/qi4.qiu2/, balloon) and CVN (/qing1.ting2/, dragonfly). Experiment 3 investigated this issue further using line drawings of common objects as targets that were preceded either by a CV (e.g., /qi3/, attempt), or a CVN (e.g., /qing2/, affection) prime. Experiment 4 further examined the priming effect by a comparison between CV or CVN priming and an unrelated priming condition using CV-NX (e.g., /mi2.ni3/, mini) and CVN-CX (e.g., /min2.ju1/, dwellings) as target words. These four experiments consistently found that CV targets were named faster when preceded by CV primes than when they were preceded by CVG, CVN or unrelated primes, whereas CVG or CVN targets showed the reverse pattern. These results indicate that the priming effect critically depends on the match between the structure of the prime and that of the first syllable of the target. The effect obtained in this study was consistent across different stimuli and different tasks (word and picture naming), and provides more conclusive and consistent data regarding the role of the syllable in Chinese speech production.

Yu, M. X., Mo, C., & Mo, L. (2014). The role of phoneme in Mandarin Chinese production: Evidence from ERPs. PLoS One, 9(9), e106486. DOI:10.1371/journal.pone.0106486. PMID:25191857.

Established linguistic theoretical frameworks propose that alphabetic language speakers use phonemes as phonological encoding units during speech production whereas Mandarin Chinese speakers use syllables. This framework was challenged by recent neural evidence of facilitation induced by overlapping initial phonemes, raising the possibility that phonemes also contribute to the phonological encoding process in Chinese. However, there is no evidence of non-initial phoneme involvement in Chinese phonological encoding among representative Chinese speakers, rendering the functional role of phonemes in spoken Chinese controversial. Here, we addressed this issue by systematically investigating the word-initial and non-initial phoneme repetition effect on the electrophysiological signal using a picture-naming priming task in which native Chinese speakers produced disyllabic word pairs. We found that overlapping phonemes in both the initial and non-initial position evoked more positive ERPs in the 180- to 300-ms interval, indicating position-invariant repetition facilitation effect during phonological encoding. Our findings thus revealed the fundamental role of phonemes as independent phonological encoding units in Mandarin Chinese.

Zhang, Q. F., & Zhu, X. B. (2011). The temporal and spatial features of segmental and suprasegmental encoding during implicit picture naming: An event-related potential study. Neuropsychologia, 49, 3813-3825. DOI:10.1016/j.neuropsychologia.2011.09.040.
