Acta Psychologica Sinica, 2022, Vol. 54, Issue 9: 1122-1136. doi: 10.3724/SP.J.1041.2022.01122
TONG Hao, YU Xiaofeng, QIN Chunying, PENG Yafeng, ZHONG Xiaoyuan
Received: 2021-08-05
Online: 2022-07-21
Published: 2022-09-25
Contact: YU Xiaofeng, E-mail: xyu6@jxnu.edu.cn
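Abstract:
This paper proposes a person-fit statistic, R, for tests with polytomously scored items. It examines how well R detects six common aberrant response patterns (cheating, guessing, random responding, carelessness, creative responding, and mixed aberrance) and compares it with the standardized log-likelihood statistic lzp. The results show that: (1) when the coverage of aberrant responses is low and the aberrance type is cheating or guessing, the detection rate of R is significantly higher than that of lzp; (2) as test length and the degree of aberrance increase, the detection rates of both statistics rise; (3) under some conditions, R and lzp perform similarly. An empirical data analysis further illustrates how the R statistic is used, and its results also suggest that R has good prospects for application.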
童昊, 喻晓锋, 秦春影, 彭亚风, 钟小缘. (2022). 多级计分测验中基于残差统计量的被试拟合研究. 心理学报, 54(9), 1122-1136.
TONG Hao, YU Xiaofeng, QIN Chunying, PENG Yafeng, ZHONG Xiaoyuan. (2022). Detection of aberrant response patterns using a residual-based statistic in testing with polytomous items. Acta Psychologica Sinica, 54(9), 1122-1136.
Number of items | Type I error rate | R | lzp |
---|---|---|---|
20 | 0.01 | 706.9 | -2.215 |
20 | 0.025 | 416.2 | -1.770 |
20 | 0.05 | 282.9 | -1.399 |
40 | 0.01 | 1057.4 | -2.176 |
40 | 0.025 | 691.9 | -1.760 |
40 | 0.05 | 519.6 | -1.417 |
60 | 0.01 | 1407.7 | -2.125 |
60 | 0.025 | 949.2 | -1.717 |
60 | 0.05 | 730.4 | -1.383 |
80 | 0.01 | 1904.7 | -2.127 |
80 | 0.025 | 1278.3 | -1.738 |
80 | 0.05 | 983.6 | -1.411 |
Table 1 Cutoff values of lzp and R for test lengths of 20, 40, 60, and 80 items at α = 0.05, α = 0.025, and α = 0.01 (one-sided)
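Aberrance type | Definition | Operational definition |
---|---|---|
Cheating | Low-ability examinees obtain full scores on items with high average difficulty | Randomly select low-ability examinees |
Lucky guessing | Low-ability examinees obtain full scores on items with high average difficulty by guessing | Randomly select low-ability examinees |
Random responding | Can occur among examinees across the whole ability range; responses have a certain probability of being scored 0 | Randomly select examinees and randomly draw n items; each drawn item is scored 0 with probability 0.8 and keeps its original response with probability 0.2 |
Carelessness | High-ability examinees have a certain probability of scoring 0 on items with low average difficulty | Randomly select high-ability examinees |
Creative responding | High-ability examinees score 0 on the easiest items | Randomly select high-ability examinees |
Mixed | A mixture of the aberrance types above | Each of the five types above accounts for one fifth of all aberrant examinees |
Table 2 Definitions and operational definitions of aberrant test behaviors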
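The random-responding manipulation in Table 2 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' simulation code: the function name `inject_random_responding`, the example response vector, the seed, and the choice of n = 5 affected items (taken here as 0.25 × 20 items) are all hypothetical.

```python
import numpy as np

def inject_random_responding(responses, n_aberrant_items, p_zero=0.8, rng=None):
    """Return a copy of `responses` with random responding injected.

    Hypothetical helper: draws `n_aberrant_items` items at random; each drawn
    item is scored 0 with probability `p_zero` and otherwise left unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    out = np.array(responses, copy=True)
    # Draw the affected items without replacement.
    items = rng.choice(out.size, size=n_aberrant_items, replace=False)
    # Decide independently for each drawn item whether it is set to 0.
    to_zero = rng.random(n_aberrant_items) < p_zero
    out[items[to_zero]] = 0
    return out, items

rng = np.random.default_rng(2022)  # arbitrary seed for reproducibility
# Hypothetical 20-item response vector with scores 0, 1, 2.
original = np.array([2, 1, 2, 0, 1, 2, 2, 1, 0, 2, 1, 1, 2, 0, 1, 2, 2, 1, 0, 2])
aberrant, drawn = inject_random_responding(original, n_aberrant_items=5, rng=rng)
print(drawn)
print(aberrant)
```

Only the drawn items can change, and a changed item can only become 0, which matches the operational definition in the table.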
Aberrance type | Nominal Type I error rate | Low (0.1) FAR, R | Low (0.1) FAR, lzp | Low (0.1) DR, R | Low (0.1) DR, lzp | Medium (0.25) FAR, R | Medium (0.25) FAR, lzp | Medium (0.25) DR, R | Medium (0.25) DR, lzp | High (0.5) FAR, R | High (0.5) FAR, lzp | High (0.5) DR, R | High (0.5) DR, lzp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cheating | 0.01 | 0.010 | 0.011 | 0.499 | 0.224 | 0.009 | 0.011 | 0.592 | 0.937 | 0.010 | 0.010 | 0.714 | 0.994 |
Cheating | 0.025 | 0.024 | 0.026 | 0.729 | 0.400 | 0.023 | 0.026 | 0.827 | 0.973 | 0.024 | 0.026 | 0.901 | 0.998 |
Cheating | 0.05 | 0.049 | 0.052 | 0.889 | 0.581 | 0.047 | 0.051 | 0.957 | 0.989 | 0.048 | 0.052 | 0.972 | 1 |
Lucky guessing | 0.01 | 0.010 | 0.011 | 0.124 | 0.031 | 0.009 | 0.011 | 0.230 | 0.096 | 0.010 | 0.011 | 0.457 | 0.295 |
Lucky guessing | 0.025 | 0.025 | 0.026 | 0.196 | 0.067 | 0.024 | 0.026 | 0.362 | 0.165 | 0.024 | 0.026 | 0.579 | 0.396 |
Lucky guessing | 0.05 | 0.049 | 0.052 | 0.262 | 0.119 | 0.048 | 0.052 | 0.472 | 0.243 | 0.048 | 0.052 | 0.673 | 0.487 |
Random responding | 0.01 | 0.010 | 0.010 | 0.138 | 0.068 | 0.010 | 0.010 | 0.181 | 0.205 | 0.010 | 0.011 | 0.173 | 0.387 |
Random responding | 0.025 | 0.025 | 0.025 | 0.201 | 0.120 | 0.025 | 0.025 | 0.272 | 0.279 | 0.025 | 0.026 | 0.321 | 0.465 |
Random responding | 0.05 | 0.050 | 0.050 | 0.270 | 0.185 | 0.050 | 0.050 | 0.363 | 0.354 | 0.051 | 0.051 | 0.463 | 0.535 |
Carelessness | 0.01 | 0.010 | 0.011 | 0.826 | 0.491 | 0.009 | 0.011 | 0.832 | 0.887 | 0.009 | 0.011 | 0.646 | 0.995 |
Carelessness | 0.025 | 0.024 | 0.026 | 0.914 | 0.632 | 0.025 | 0.026 | 0.952 | 0.934 | 0.024 | 0.026 | 0.907 | 0.998 |
Carelessness | 0.05 | 0.050 | 0.051 | 0.946 | 0.736 | 0.051 | 0.052 | 0.985 | 0.961 | 0.050 | 0.052 | 0.989 | 0.999 |
Creative responding | 0.01 | 0.009 | 0.011 | 0.922 | 0.704 | 0.010 | 0.011 | 0.839 | 0.998 | 0.009 | 0.011 | 0.653 | 1 |
Creative responding | 0.025 | 0.024 | 0.026 | 0.989 | 0.854 | 0.025 | 0.026 | 0.966 | 1 | 0.025 | 0.026 | 0.957 | 1 |
Creative responding | 0.05 | 0.050 | 0.052 | 0.999 | 0.939 | 0.051 | 0.052 | 0.995 | 1 | 0.050 | 0.051 | 0.997 | 1 |
Mixed | 0.01 | 0.010 | 0.010 | 0.459 | 0.299 | 0.010 | 0.011 | 0.464 | 0.600 | 0.010 | 0.012 | 0.430 | 0.659 |
Mixed | 0.025 | 0.025 | 0.025 | 0.552 | 0.405 | 0.025 | 0.026 | 0.585 | 0.638 | 0.025 | 0.028 | 0.596 | 0.688 |
Mixed | 0.05 | 0.050 | 0.051 | 0.615 | 0.495 | 0.051 | 0.051 | 0.661 | 0.669 | 0.051 | 0.054 | 0.678 | 0.715 |
Table 3 Type I error rates and detection rates of R and lzp for a test length of 20 items (FAR = false-alarm rate; DR = detection rate; Low, Medium, and High refer to the degree of aberrance)
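To make the two rate columns concrete, the sketch below shows one way a false-alarm rate and a detection rate could be computed from a cutoff. It is an illustration under assumptions rather than the authors' procedure: the function `flag_rates` and the six statistic values are hypothetical, and the flagging directions (R above its cutoff, lzp below its cutoff) are inferred from the signs and ordering of the cutoffs in Table 1. Only the two cutoffs come from Table 1 (20 items, α = 0.05).

```python
import numpy as np

# Cutoffs for a 20-item test at alpha = 0.05, taken from Table 1.
R_CUTOFF = 282.9
LZP_CUTOFF = -1.399

def flag_rates(stat, is_aberrant, cutoff, upper_tail):
    """Return (false-alarm rate, detection rate) for one statistic.

    upper_tail=True flags values above the cutoff (assumed for R);
    upper_tail=False flags values below the cutoff (assumed for lzp).
    """
    stat = np.asarray(stat, dtype=float)
    is_aberrant = np.asarray(is_aberrant, dtype=bool)
    flagged = stat > cutoff if upper_tail else stat < cutoff
    false_alarm = flagged[~is_aberrant].mean()  # flagged share of normal examinees
    detection = flagged[is_aberrant].mean()     # flagged share of aberrant examinees
    return float(false_alarm), float(detection)

# Hypothetical statistic values for six examinees (the last three are aberrant).
is_aberrant = [False, False, False, True, True, True]
r_values = [120.5, 300.1, 85.0, 410.7, 250.3, 990.2]
lzp_values = [0.31, -0.85, 1.10, -2.40, -1.10, -3.05]

print(flag_rates(r_values, is_aberrant, R_CUTOFF, upper_tail=True))
print(flag_rates(lzp_values, is_aberrant, LZP_CUTOFF, upper_tail=False))
```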
Aberrance type | Nominal Type I error rate | Low (0.1) FAR, R | Low (0.1) FAR, lzp | Low (0.1) DR, R | Low (0.1) DR, lzp | Medium (0.25) FAR, R | Medium (0.25) FAR, lzp | Medium (0.25) DR, R | Medium (0.25) DR, lzp | High (0.5) FAR, R | High (0.5) FAR, lzp | High (0.5) DR, R | High (0.5) DR, lzp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cheating | 0.01 | 0.009 | 0.011 | 0.589 | 0.326 | 0.009 | 0.011 | 0.905 | 1 | 0.009 | 0.011 | 0.799 | 1 |
Cheating | 0.025 | 0.022 | 0.027 | 0.794 | 0.516 | 0.023 | 0.027 | 0.991 | 1 | 0.022 | 0.027 | 0.954 | 1 |
Cheating | 0.05 | 0.046 | 0.053 | 0.925 | 0.676 | 0.046 | 0.053 | 1 | 1 | 0.046 | 0.053 | 0.997 | 1 |
Lucky guessing | 0.01 | 0.009 | 0.011 | 0.140 | 0.025 | 0.009 | 0.011 | 0.522 | 0.193 | 0.009 | 0.011 | 0.591 | 0.432 |
Lucky guessing | 0.025 | 0.023 | 0.026 | 0.205 | 0.059 | 0.023 | 0.027 | 0.617 | 0.292 | 0.023 | 0.026 | 0.733 | 0.552 |
Lucky guessing | 0.05 | 0.047 | 0.054 | 0.274 | 0.107 | 0.046 | 0.053 | 0.682 | 0.389 | 0.046 | 0.052 | 0.822 | 0.653 |
Random responding | 0.01 | 0.010 | 0.010 | 0.127 | 0.072 | 0.010 | 0.010 | 0.197 | 0.295 | 0.010 | 0.010 | 0.168 | 0.438 |
Random responding | 0.025 | 0.025 | 0.025 | 0.187 | 0.130 | 0.025 | 0.026 | 0.291 | 0.378 | 0.025 | 0.025 | 0.329 | 0.510 |
Random responding | 0.05 | 0.050 | 0.050 | 0.255 | 0.199 | 0.049 | 0.050 | 0.393 | 0.454 | 0.050 | 0.051 | 0.487 | 0.575 |
Carelessness | 0.01 | 0.010 | 0.011 | 0.576 | 0.472 | 0.010 | 0.010 | 0.891 | 0.982 | 0.011 | 0.010 | 0.677 | 0.995 |
Carelessness | 0.025 | 0.025 | 0.026 | 0.770 | 0.624 | 0.026 | 0.026 | 0.979 | 0.992 | 0.025 | 0.025 | 0.920 | 0.998 |
Carelessness | 0.05 | 0.050 | 0.051 | 0.898 | 0.735 | 0.050 | 0.051 | 0.995 | 0.996 | 0.050 | 0.050 | 0.985 | 0.999 |
Creative responding | 0.01 | 0.010 | 0.010 | 0.639 | 0.695 | 0.010 | 0.010 | 0.886 | 1 | 0.010 | 0.010 | 0.813 | 1 |
Creative responding | 0.025 | 0.025 | 0.026 | 0.840 | 0.828 | 0.025 | 0.026 | 0.987 | 1 | 0.025 | 0.025 | 0.973 | 1 |
Creative responding | 0.05 | 0.050 | 0.052 | 0.952 | 0.906 | 0.050 | 0.051 | 0.999 | 1 | 0.050 | 0.050 | 0.994 | 1 |
Mixed | 0.01 | 0.010 | 0.010 | 0.372 | 0.317 | 0.010 | 0.012 | 0.549 | 0.637 | 0.010 | 0.012 | 0.488 | 0.668 |
Mixed | 0.025 | 0.025 | 0.025 | 0.503 | 0.423 | 0.025 | 0.029 | 0.620 | 0.663 | 0.025 | 0.029 | 0.617 | 0.701 |
Mixed | 0.05 | 0.049 | 0.051 | 0.595 | 0.510 | 0.051 | 0.055 | 0.660 | 0.689 | 0.051 | 0.056 | 0.688 | 0.731 |
Table 4 Type I error rates and detection rates of R and lzp for a test length of 40 items (FAR = false-alarm rate; DR = detection rate; Low, Medium, and High refer to the degree of aberrance)
Aberrance type | Nominal Type I error rate | Low (0.1) FAR, R | Low (0.1) FAR, lzp | Low (0.1) DR, R | Low (0.1) DR, lzp | Medium (0.25) FAR, R | Medium (0.25) FAR, lzp | Medium (0.25) DR, R | Medium (0.25) DR, lzp | High (0.5) FAR, R | High (0.5) FAR, lzp | High (0.5) DR, R | High (0.5) DR, lzp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cheating | 0.01 | 0.009 | 0.010 | 0.930 | 0.790 | 0.009 | 0.011 | 0.975 | 1 | 0.009 | 0.011 | 0.868 | 1 |
Cheating | 0.025 | 0.024 | 0.025 | 0.996 | 0.904 | 0.023 | 0.026 | 1 | 1 | 0.024 | 0.026 | 0.998 | 1 |
Cheating | 0.05 | 0.047 | 0.050 | 1 | 0.957 | 0.048 | 0.052 | 1 | 1 | 0.048 | 0.051 | 1 | 1 |
Lucky guessing | 0.01 | 0.009 | 0.011 | 0.298 | 0.053 | 0.009 | 0.010 | 0.513 | 0.230 | 0.009 | 0.010 | 0.590 | 0.484 |
Lucky guessing | 0.025 | 0.023 | 0.026 | 0.366 | 0.105 | 0.023 | 0.026 | 0.618 | 0.341 | 0.023 | 0.026 | 0.722 | 0.605 |
Lucky guessing | 0.05 | 0.047 | 0.051 | 0.426 | 0.176 | 0.047 | 0.051 | 0.701 | 0.445 | 0.048 | 0.051 | 0.820 | 0.702 |
Random responding | 0.01 | 0.010 | 0.009 | 0.275 | 0.162 | 0.010 | 0.010 | 0.375 | 0.429 | 0.010 | 0.010 | 0.430 | 0.655 |
Random responding | 0.025 | 0.025 | 0.024 | 0.343 | 0.248 | 0.025 | 0.025 | 0.461 | 0.514 | 0.025 | 0.025 | 0.576 | 0.721 |
Random responding | 0.05 | 0.050 | 0.049 | 0.414 | 0.333 | 0.050 | 0.050 | 0.549 | 0.590 | 0.050 | 0.049 | 0.696 | 0.775 |
Carelessness | 0.01 | 0.010 | 0.011 | 0.968 | 0.881 | 0.009 | 0.011 | 0.987 | 0.998 | 0.009 | 0.010 | 0.974 | 1 |
Carelessness | 0.025 | 0.024 | 0.026 | 0.975 | 0.929 | 0.023 | 0.027 | 0.995 | 0.999 | 0.024 | 0.027 | 0.994 | 1 |
Carelessness | 0.05 | 0.048 | 0.052 | 0.981 | 0.956 | 0.047 | 0.052 | 0.999 | 1 | 0.048 | 0.052 | 0.999 | 1 |
Creative responding | 0.01 | 0.010 | 0.010 | 1 | 0.999 | 0.009 | 0.011 | 1 | 1 | 0.010 | 0.011 | 1 | 1 |
Creative responding | 0.025 | 0.024 | 0.026 | 1 | 1 | 0.023 | 0.026 | 1 | 1 | 0.024 | 0.027 | 1 | 1 |
Creative responding | 0.05 | 0.048 | 0.053 | 1 | 1 | 0.047 | 0.052 | 1 | 1 | 0.048 | 0.052 | 1 | 1 |
Mixed | 0.01 | 0.010 | 0.011 | 0.602 | 0.563 | 0.010 | 0.011 | 0.660 | 0.683 | 0.010 | 0.013 | 0.658 | 0.756 |
Mixed | 0.025 | 0.025 | 0.026 | 0.625 | 0.612 | 0.025 | 0.028 | 0.679 | 0.710 | 0.026 | 0.030 | 0.724 | 0.776 |
Mixed | 0.05 | 0.051 | 0.052 | 0.645 | 0.649 | 0.053 | 0.055 | 0.703 | 0.736 | 0.053 | 0.057 | 0.755 | 0.791 |
Table 5 Type I error rates and detection rates of R and lzp for a test length of 60 items (FAR = false-alarm rate; DR = detection rate; Low, Medium, and High refer to the degree of aberrance)
Aberrance type | Nominal Type I error rate | Low (0.1) FAR, R | Low (0.1) FAR, lzp | Low (0.1) DR, R | Low (0.1) DR, lzp | Medium (0.25) FAR, R | Medium (0.25) FAR, lzp | Medium (0.25) DR, R | Medium (0.25) DR, lzp | High (0.5) FAR, R | High (0.5) FAR, lzp | High (0.5) DR, R | High (0.5) DR, lzp |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Cheating | 0.01 | 0.010 | 0.011 | 1 | 0.963 | 0.009 | 0.011 | 0.985 | 1 | 0.009 | 0.011 | 0.985 | 1 |
Cheating | 0.025 | 0.024 | 0.027 | 1 | 0.986 | 0.024 | 0.027 | 1 | 1 | 0.024 | 0.027 | 1 | 1 |
Cheating | 0.05 | 0.048 | 0.052 | 1 | 0.995 | 0.047 | 0.053 | 1 | 1 | 0.047 | 0.053 | 1 | 1 |
Lucky guessing | 0.01 | 0.009 | 0.011 | 0.316 | 0.082 | 0.010 | 0.011 | 0.504 | 0.284 | 0.010 | 0.011 | 0.620 | 0.677 |
Lucky guessing | 0.025 | 0.024 | 0.026 | 0.400 | 0.148 | 0.024 | 0.026 | 0.639 | 0.404 | 0.024 | 0.027 | 0.780 | 0.776 |
Lucky guessing | 0.05 | 0.048 | 0.052 | 0.490 | 0.227 | 0.048 | 0.052 | 0.749 | 0.515 | 0.048 | 0.053 | 0.889 | 0.844 |
Random responding | 0.01 | 0.010 | 0.010 | 0.178 | 0.133 | 0.010 | 0.010 | 0.242 | 0.406 | 0.010 | 0.010 | 0.195 | 0.561 |
Random responding | 0.025 | 0.025 | 0.025 | 0.250 | 0.207 | 0.025 | 0.025 | 0.344 | 0.481 | 0.025 | 0.024 | 0.387 | 0.625 |
Random responding | 0.05 | 0.050 | 0.050 | 0.325 | 0.288 | 0.050 | 0.050 | 0.450 | 0.548 | 0.049 | 0.049 | 0.562 | 0.683 |
Carelessness | 0.01 | 0.009 | 0.011 | 0.962 | 0.912 | 0.009 | 0.011 | 0.954 | 1 | 0.009 | 0.011 | 0.797 | 1 |
Carelessness | 0.025 | 0.023 | 0.027 | 0.991 | 0.953 | 0.023 | 0.027 | 0.996 | 1 | 0.024 | 0.027 | 0.980 | 1 |
Carelessness | 0.05 | 0.046 | 0.053 | 0.997 | 0.975 | 0.046 | 0.053 | 1 | 1 | 0.047 | 0.052 | 0.999 | 1 |
Creative responding | 0.01 | 0.009 | 0.011 | 0.999 | 0.995 | 0.009 | 0.011 | 0.960 | 1 | 0.009 | 0.011 | 0.956 | 1 |
Creative responding | 0.025 | 0.023 | 0.026 | 1 | 0.999 | 0.023 | 0.027 | 0.999 | 1 | 0.023 | 0.027 | 0.999 | 1 |
Creative responding | 0.05 | 0.047 | 0.053 | 1 | 1 | 0.046 | 0.053 | 1 | 1 | 0.046 | 0.053 | 1 | 1 |
Mixed | 0.01 | 0.010 | 0.011 | 0.600 | 0.595 | 0.010 | 0.012 | 0.589 | 0.677 | 0.010 | 0.014 | 0.568 | 0.724 |
Mixed | 0.025 | 0.025 | 0.026 | 0.616 | 0.623 | 0.025 | 0.029 | 0.633 | 0.707 | 0.026 | 0.032 | 0.644 | 0.752 |
Mixed | 0.05 | 0.051 | 0.053 | 0.635 | 0.651 | 0.052 | 0.055 | 0.673 | 0.733 | 0.053 | 0.060 | 0.704 | 0.775 |
Table 6 Type I error rates and detection rates of R and lzp for a test length of 80 items (FAR = false-alarm rate; DR = detection rate; Low, Medium, and High refer to the degree of aberrance)
Aberrance type | Degree of aberrance | M = 20, R | M = 20, lzp | M = 40, R | M = 40, lzp | M = 60, R | M = 60, lzp | M = 80, R | M = 80, lzp |
---|---|---|---|---|---|---|---|---|---|
Cheating | Low (0.1) | 0.979 | 0.921 | 0.986 | 0.937 | 0.997 | 0.989 | 0.999 | 0.996 |
Cheating | Medium (0.25) | 0.987 | 0.995 | 0.996 | 0.998 | 0.997 | 0.998 | 0.998 | 0.998 |
Cheating | High (0.5) | 0.989 | 0.998 | 0.994 | 0.998 | 0.995 | 0.998 | 0.997 | 0.998 |
Lucky guessing | Low (0.1) | 0.643 | 0.598 | 0.689 | 0.610 | 0.771 | 0.669 | 0.828 | 0.709 |
Lucky guessing | Medium (0.25) | 0.775 | 0.704 | 0.891 | 0.795 | 0.919 | 0.829 | 0.940 | 0.860 |
Lucky guessing | High (0.5) | 0.888 | 0.827 | 0.961 | 0.908 | 0.964 | 0.925 | 0.979 | 0.962 |
Random responding | Low (0.1) | 0.696 | 0.649 | 0.699 | 0.671 | 0.780 | 0.752 | 0.752 | 0.724 |
Random responding | Medium (0.25) | 0.756 | 0.736 | 0.780 | 0.800 | 0.848 | 0.861 | 0.825 | 0.842 |
Random responding | High (0.5) | 0.807 | 0.834 | 0.808 | 0.846 | 0.902 | 0.927 | 0.867 | 0.898 |
Carelessness | Low (0.1) | 0.973 | 0.930 | 0.979 | 0.940 | 0.996 | 0.988 | 0.998 | 0.993 |
Carelessness | Medium (0.25) | 0.993 | 0.989 | 0.996 | 0.997 | 0.999 | 0.998 | 0.998 | 0.998 |
Carelessness | High (0.5) | 0.990 | 0.998 | 0.990 | 0.998 | 0.998 | 0.998 | 0.994 | 0.998 |
Creative responding | Low (0.1) | 0.997 | 0.985 | 0.987 | 0.980 | 0.999 | 0.998 | 0.999 | 0.998 |
Creative responding | Medium (0.25) | 0.994 | 0.998 | 0.996 | 0.998 | 0.999 | 0.998 | 0.997 | 0.998 |
Creative responding | High (0.5) | 0.991 | 0.998 | 0.993 | 0.998 | 0.998 | 0.998 | 0.997 | 0.998 |
Mixed | Low (0.1) | 0.827 | 0.794 | 0.824 | 0.803 | 0.845 | 0.837 | 0.846 | 0.838 |
Mixed | Medium (0.25) | 0.854 | 0.849 | 0.854 | 0.859 | 0.872 | 0.877 | 0.873 | 0.877 |
Mixed | High (0.5) | 0.866 | 0.870 | 0.865 | 0.876 | 0.885 | 0.891 | 0.880 | 0.888 |
Table 7 AUC values of R and lzp (M = test length in items)
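For readers unfamiliar with AUC, the sketch below computes it with the rank-based (Mann-Whitney) formulation: the probability that a randomly chosen aberrant examinee has a more extreme score than a randomly chosen normal examinee, with ties counted as one half. This is a generic illustration, not necessarily how the values in Table 7 were obtained. The function `auc` and all statistic values are hypothetical, and lzp is negated on the assumption that smaller lzp values indicate misfit.

```python
import numpy as np

def auc(scores, is_aberrant):
    """Rank-based AUC, where larger scores mean 'more aberrant'."""
    scores = np.asarray(scores, dtype=float)
    is_aberrant = np.asarray(is_aberrant, dtype=bool)
    pos = scores[is_aberrant][:, None]   # aberrant examinees, as a column
    neg = scores[~is_aberrant][None, :]  # normal examinees, as a row
    # Share of (aberrant, normal) pairs ranked correctly; ties count one half.
    return float((pos > neg).mean() + 0.5 * (pos == neg).mean())

# Hypothetical statistic values for six examinees (the last three are aberrant).
is_aberrant = [False, False, False, True, True, True]
r_values = np.array([120.5, 300.1, 85.0, 410.7, 250.3, 990.2])
lzp_values = np.array([0.31, -0.85, 1.10, -2.40, -1.10, -3.05])

auc_r = auc(r_values, is_aberrant)
auc_lzp = auc(-lzp_values, is_aberrant)  # negate: low lzp is treated as aberrant
print(auc_r, auc_lzp)
```

Unlike the detection rates in the preceding tables, AUC does not depend on a particular cutoff, which is why it summarizes each statistic across all nominal Type I error rates.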
Item | a | b1 | b2 | Item | a | b1 | b2 |
---|---|---|---|---|---|---|---|
1 | 0.859 | 0.393 | 2.199 | 11 | 0.759 | -0.544 | 1.262 |
2 | 0.655 | 0.668 | 2.474 | 12 | 0.661 | 0.336 | 2.142 |
3 | 0.889 | 0.741 | 2.547 | 13 | 0.962 | 0.445 | 2.251 |
4 | 0.499 | -0.778 | 1.028 | 14 | 0.729 | 0.740 | 2.546 |
5 | 0.814 | -0.345 | 1.461 | 15 | 0.652 | -0.155 | 1.651 |
6 | 1.136 | 0.272 | 2.078 | 16 | 1.314 | 0.064 | 1.870 |
7 | 0.813 | -0.211 | 1.595 | 17 | 0.863 | 0.596 | 2.402 |
8 | 0.503 | -1.028 | 0.778 | 18 | 0.827 | 0.023 | 1.829 |
9 | 0.917 | 1.424 | 3.230 | 19 | 0.859 | 1.794 | 3.600 |
10 | 0.831 | 0.946 | 2.752 | - | - | - | - |
Table 8 Item parameters of the GRM fitted to the National Longitudinal Study of Adolescent Health (1994-1995) data
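As a reading aid for Table 8, the sketch below turns one item's parameters into category probabilities under the graded response model (GRM). It assumes the common logistic form P(X ≥ k | θ) = 1 / (1 + exp(−a(θ − b_k))) without a 1.7 scaling constant; the source page does not state which parameterization was used. The function name `grm_category_probs` and the ability value θ = 0 are hypothetical; the parameters are those of item 1.

```python
import numpy as np

def grm_category_probs(theta, a, b):
    """Category probabilities P(X = 0), ..., P(X = K) for one GRM item.

    Assumes the logistic form without the 1.7 scaling constant and
    thresholds `b` given in increasing order.
    """
    b = np.asarray(b, dtype=float)
    # Cumulative probabilities P(X >= k) for k = 1..K.
    p_star = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    # Pad with P(X >= 0) = 1 and P(X >= K + 1) = 0, then take differences.
    bounds = np.concatenate(([1.0], p_star, [0.0]))
    return bounds[:-1] - bounds[1:]

# Item 1 from Table 8: a = 0.859, b1 = 0.393, b2 = 2.199; theta = 0 is illustrative.
probs = grm_category_probs(theta=0.0, a=0.859, b=[0.393, 2.199])
print(probs)        # probabilities of scoring 0, 1, and 2
print(probs.sum())  # the three probabilities sum to 1
```

Because both thresholds of item 1 lie above θ = 0, the lowest category is the most likely response for such an examinee.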
Type | Examinee No. | Response vector |
---|---|---|
Both | 15 | (2,1,0,0,0,0,0,0,0,0,0,2,0,0,2,0,0,0,0) |
Both | 23 | (0,1,0,2,1,0,0,1,2,0,2,2,1,0,2,2,2,0,2) |
Both | 48 | (2,0,2,2,1,2,0,2,2,1,2,1,2,2,2,2,2,0,2) |
Both | 49 | (1,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,0,0,2) |
Both | 50 | (2,2,2,0,2,2,2,1,0,0,1,2,0,0,1,0,0,0,0) |
R | 43 | (0,0,2,2,0,0,1,1,0,1,1,0,0,0,0,0,0,1,0) |
R | 45 | (0,0,0,2,1,2,0,0,0,1,0,0,0,0,0,1,0,0,0) |
R | 99 | (1,0,0,1,0,1,1,0,0,2,1,2,0,0,0,0,1,1,0) |
R | 108 | (0,0,0,1,0,0,0,2,0,0,1,0,0,0,2,2,0,0,0) |
R | 114 | (0,2,0,0,1,0,0,0,0,0,0,0,0,0,2,0,0,0,0) |
lzp | 6 | (1,0,0,1,2,0,2,0,1,0,2,1,0,0,1,0,1,0,1) |
lzp | 36 | (0,0,2,2,1,2,0,2,1,0,2,1,2,0,2,1,0,0,0) |
lzp | 52 | (1,0,2,2,2,0,2,0,1,2,2,1,2,0,2,2,2,1,0) |
lzp | 55 | (2,2,2,1,2,2,1,2,0,2,1,0,2,0,1,2,0,1,0) |
lzp | 101 | (0,1,0,1,0,2,1,2,1,2,2,0,1,0,2,1,1,1,1) |
Table 9 Response patterns of examinees flagged as aberrant
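The sketch below shows how a standardized log-likelihood statistic of the lzp type could be evaluated for one flagged pattern. It is a rough illustration under several assumptions and will not reproduce the authors' results: it uses the textbook definition (observed log-likelihood minus its expectation, divided by its standard deviation), a logistic GRM without the 1.7 scaling constant, the item parameters of Table 8, and a hypothetical ability value θ = −0.5 in place of an estimated ability. The function names are illustrative.

```python
import numpy as np

# Item parameters (a, b1, b2) for the 19 items, copied from Table 8.
A = np.array([0.859, 0.655, 0.889, 0.499, 0.814, 1.136, 0.813, 0.503, 0.917, 0.831,
              0.759, 0.661, 0.962, 0.729, 0.652, 1.314, 0.863, 0.827, 0.859])
B1 = np.array([0.393, 0.668, 0.741, -0.778, -0.345, 0.272, -0.211, -1.028, 1.424, 0.946,
               -0.544, 0.336, 0.445, 0.740, -0.155, 0.064, 0.596, 0.023, 1.794])
B2 = np.array([2.199, 2.474, 2.547, 1.028, 1.461, 2.078, 1.595, 0.778, 3.230, 2.752,
               1.262, 2.142, 2.251, 2.546, 1.651, 1.870, 2.402, 1.829, 3.600])

def grm_probs(theta):
    """Return a (19, 3) matrix with P(X = 0), P(X = 1), P(X = 2) per item."""
    p1 = 1.0 / (1.0 + np.exp(-A * (theta - B1)))  # P(X >= 1)
    p2 = 1.0 / (1.0 + np.exp(-A * (theta - B2)))  # P(X >= 2)
    return np.column_stack([1.0 - p1, p1 - p2, p2])

def lzp(responses, theta):
    """Standardized log-likelihood of a response pattern at ability theta."""
    probs = grm_probs(theta)
    log_p = np.log(probs)
    # Observed log-likelihood: log-probability of each chosen category.
    observed = log_p[np.arange(len(responses)), responses].sum()
    # Expectation and variance of the log-likelihood, summed over items.
    item_mean = (probs * log_p).sum(axis=1)
    item_var = (probs * log_p**2).sum(axis=1) - item_mean**2
    return float((observed - item_mean.sum()) / np.sqrt(item_var.sum()))

# Examinee 15 from Table 9; theta = -0.5 is a hypothetical ability value.
examinee_15 = np.array([2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 2, 0, 0, 0, 0])
print(lzp(examinee_15, theta=-0.5))
```

In practice the ability would be estimated from the responses themselves, and the resulting value would be compared with a cutoff such as those in Table 1.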
[1] |
Buchanan, T., & Smith, J. L. (1999). Using the internet for psychological research: Personality testing on the world wide web. British Journal of Psychology, 90(1), 125-144.
doi: 10.1348/000712699161189 URL |
[2] | Chen, Q., Ding, S., Zhu, L., & Xu, Z. (2010). Three-parameter graded response model and its parameter estimation. Journal of Jiangxi Normal University (Natural Science), 34(2), 117-122. |
[陈青, 丁树良, 朱隆尹, 许志勇. (2010). 三参数等级反应模型及其参数估计. 江西师范大学学报(自然科学版), 34(2), 117-122.] | |
[3] | Cheng, X., Ding, S., Zhu, L., & Wu, H. (2012). The stratified item selection strategy with maximal information under graded response model. Journal of Jiangxi Normal University (Natural Science), 36(5), 117-122. |
[程小扬, 丁树良, 朱隆尹, 巫华芳. (2012). 等级评分模型下的最大信息量分层选题策略. 江西师范大学学报(自然科学版), 36(5), 446-451.] | |
[4] | Cooperman, A. W., Weiss, D. J., & Wang, C. (2021). Robustness of adaptive measurement of change to item parameter estimation error. Educational and Psychological Measurement, Advance online publication. |
[5] | Curran, P. G., Kotrba, L., Denison, D. (2010, April). Careless responding in surveys: Applying traditional techniques to organizational settings. Paper presented at the 25th annual conference of the Society for Industrial/Organizational Psychology, Atlanta, GA. |
[6] |
de la Torre, J., & Deng, W. (2008). Improving person-fit assessment by correcting the ability estimate and its reference distribution. Journal of Educational Measurement, 45(2), 159-177.
doi: 10.1111/j.1745-3984.2008.00058.x URL |
[7] |
Dodd, B, G., de Ayala, R, J, & Koch, W, R. (1995). Computerized adaptive testing with polytomous items. Applied Psychological Measurement, 19(1), 5-22.
doi: 10.1177/014662169501900103 URL |
[8] |
Donlon, T. F., & Fischer, F. E. (1968). An index of an individual's agreement with group-determined item difficulties. Educational and Psychological Measurement, 28(1), 105-113.
doi: 10.1177/001316446802800110 URL |
[9] |
Doval, E., & Delicado, P. (2020). Identifying and classifying aberrant response patterns through functional data analysis. Journal of Educational and Behavioral Statistics, 45(6), 719-749.
doi: 10.3102/1076998620911941 URL |
[10] |
Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38(1), 67-86.
doi: 10.1111/j.2044-8317.1985.tb00817.x URL |
[11] |
Emons, W. H. M. (2008). Nonparametric person-fit analysis of polytomous item scores. Applied Psychological Measurement, 32(3), 224-247.
doi: 10.1177/0146621607302479 URL |
[12] |
Fung, W. K. (1993). Unmasking outliers and leverage points: A confirmation. Journal of the American Statistical Association, 88(422), 515-519.
doi: 10.1080/01621459.1993.10476302 URL |
[13] |
Glas, C. A. W., & Dagohoy, A. V. T. (2007). A person fit test for IRT models for polytomous items. Psychometrika, 72(2), 159-180.
doi: 10.1007/s11336-003-1081-5 URL |
[14] | Gulliksen, H. (1950). Theory of mental tests. John Wiley & Sons Inc. |
[15] |
Guttman, L. (1944). A basis for scaling qualitative data. American Sociological Review, 9(2), 139-150.
doi: 10.2307/2086306 URL |
[16] | Guttman, L. (1950). The basis for scalogram analysis. In S. A. Stouffer, et al. (Eds.), Measurement and prediction (pp.60-90). Princeton: Princeton University Press. |
[17] | Harris, K. M., & Udry, J. R. (2010). National Longitudinal Study of Adolescent Health (Add Health), 1994-2008: Core files [restricted use] (Technical report). Ann Arbor, MI: Inter-University Consortium for Political and Social Research. |
[18] |
Hong, M., Steedle, J. T., & Cheng, Y. (2020). Methods of detecting insufficient effort responding: Comparisons and practical recommendations. Educational and Psychological Measurement, 80(2), 312-345.
doi: 10.1177/0013164419865316 URL |
[19] | Hotaka, M. (2017). Robust latent ability estimation based on item response information and model fit (Dissertation). Milwaukee. |
[20] |
Huang, J. L., Bowling, N. A., Liu, M. Q., & Li, Y. H. (2015). Detecting insufficient effort responding with an infrequency scale: Evaluating validity and participant reactions. Journal of Business and Psychology, 30, 299-311.
doi: 10.1007/s10869-014-9357-6 URL |
[21] |
Karabatsos, G. (2003). Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education, 16(4), 277-298.
doi: 10.1207/S15324818AME1604_2 URL |
[22] |
Levine, M. V., & Rubin, D. B. (1979). Measuring the appropriateness of multiple-choice test scores. Journal of Educational Statistics, 4(4), 269-290.
doi: 10.3102/10769986004004269 URL |
[23] | Li, J., & Ding, S. (2018). The several stratified methods of CAT in the presence of calibration error on GRM. Journal of Jiangxi Normal University (Natural Science), 42(4), 374-378. |
[李佳, 丁树良. (2018). 基于GRM模型的CAT分层方法在校准误差中的应用研究. 江西师范大学学报(自然科学版), 42(4), 374-378.] | |
[24] | Liu, Y., & Liu, H. Y. (2018). A comparison study for the four parameter logistic model and traditional logistic models. Psychological Exploration, 38(3), 228-235. |
[刘玥, 刘红云. (2018). 四参数Logistic模型和传统模型对被试作答拟合能力的比较研究. 心理学探新, 38(3), 228-235.] | |
[25] |
Lu, Y., & Sireci, S. G. (2007). Validity issues in test speededness. Educational Measurement: Issues and Practice, 26(4), 29-37.
doi: 10.1111/j.1745-3992.2007.00106.x URL |
[26] |
Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149-174.
doi: 10.1007/BF02296272 URL |
[27] | Masters, G. N., & Wright, B. D. (1997). The partial credit model. In W. J. van der Linden (Ed.), Handbook of modern item response theory (pp.101-121). New York, NY: Springer. |
[28] |
Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. Psychological Methods, 17(3), 437-455.
doi: 10.1037/a0028085 pmid: 22506584 |
[29] |
Meijer, R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25(2), 107-135.
doi: 10.1177/01466210122031957 URL |
[30] |
Nering, M, L. ( 1995). The distribution of person fit using true and estimated person parameters. Applied Psychological Measurement, 19(2), 121-129.
doi: 10.1177/014662169501900201 URL |
[31] |
Oshima, T. C. (1994). The effect of speededness on parameter estimation in item response theory. Journal of Educational Measurement, 31(3), 200-219.
doi: 10.1111/j.1745-3984.1994.tb00443.x URL |
[32] |
Rogers, H. J., & Hattie, J. A. (1987). A Monte Carlo investigation of several person and item fit statistics for item response models. Applied Psychological Measurement, 11, 47-57
doi: 10.1177/014662168701100103 URL |
[33] | Rupp, A. A. (2013). A systematic review of the methodology for person fit research in Item Response Theory: Lessons about generalizability of inferences from the design of simulation studies. Psychological Test and Assessment Modeling, 55(1), 3-8. |
[34] | Samejima, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika, 34(4), 1-97. |
[35] | Schnipke, D. L. (1996). How contaminated by guessing are item-parameter estimates and what can be done about it? Paper presented at the annual meeting of the National Council on Measurement in Education, New York, NY. |
[36] |
Schnipke, D. L., & Scrams, D. J. (1997). Modeling item response times with a two-state mixture model: A new method of measuring speededness. Journal of Educational Measurement, 34(3), 213-232.
doi: 10.1111/j.1745-3984.1997.tb00516.x URL |
[37] |
Shao, C., Li, J., & Cheng, Y. (2016). Detection of test speededness using change-point analysis. Psychometrika, 81(4), 1118-1141.
pmid: 26305400 |
[38] |
Sinharay, S. (2016). Asymptotically correct standardization of person-fit statistics beyond dichotomous items. Psychometrika, 81(4), 992-1013.
pmid: 25953476 |
[39] |
Snijders, T. (2001). Asymptotic null distribution of person-fit statistics with estimated person parameter. Psychometrika, 66(3), 331-342.
doi: 10.1007/BF02294437 |
[40] |
van der Ark, L. A. (2001). Relationships and properties of polytomous item response theory models. Applied Psychological Measurement, 25(3), 273-282.
doi: 10.1177/01466210122032073 |
[41] |
van Krimpen-Stoop, M. L. A., & Meijer, R. R. (2002). Detection of person misfit in computerized adaptive tests with polytomous items. Applied Psychological Measurement, 26(2), 164-180.
doi: 10.1177/01421602026002004 |
[42] |
Wollack, J. A., Cohen, A. S., & Wells, C. S. (2003). A method for maintaining scale stability in the presence of test speededness. Journal of Educational Measurement, 40(4), 307-330.
doi: 10.1111/j.1745-3984.2003.tb01149.x |
[43] | Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press. |
[44] | Wright, B. D., & Stone, M. H. (1979). Best test design: Rasch measurement. Chicago: MESA Press. |
[45] |
Xiong, J., Ding, S., Luo, F., & Luo, Z. (2020). Online calibration of polytomous items under the graded response model. Frontiers in Psychology, 10, 3085.
doi: 10.3389/fpsyg.2019.03085 |
[46] | Xiong, J., Luo, H., Wang, X., & Ding, S. (2018). The online calibration based on graded response model. Journal of Jiangxi Normal University (Natural Science), 42(1), 62-66. |
[熊建华, 罗慧, 王晓庆, 丁树良. (2018). 基于GRM的在线校准研究. 江西师范大学学报(自然科学版), 42(1), 62-66.] | |
[47] |
Yu, X., & Cheng, Y. (2019). A change-point analysis procedure based on weighted residuals to detect back random responding. Psychological Methods, 24(5), 658-674.
doi: 10.1037/met0000212 |
[48] |
Yuan, K. H., & Zhong, X. (2008). Outliers, leverage observations, and influential cases in factor analysis: Using robust procedures to minimize their effect. Sociological Methodology, 38(1), 329-368.
doi: 10.1111/j.1467-9531.2008.00198.x |