Advances in Psychological Science, 2020, Vol. 28, Issue 9: 1462-1477. doi: 10.3724/SP.J.1042.2020.01462
• Research Method •
ZHANG Longfei, WANG Xiaowen, CAI Yan, TU Dongbo
Received: 2019-10-12
Online: 2020-09-15
Published: 2020-07-24
Contact: TU Dongbo
E-mail: tudongbo@aliyun.com
ZHANG Longfei, WANG Xiaowen, CAI Yan, TU Dongbo. Change point analysis: A new method to detect aberrant responses in psychological and educational testing[J]. Advances in Psychological Science, 2020, 28(9): 1462-1477.
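 | CUSUM | CPA |
---|---|---|
Main idea | Accumulate, in item order, the residuals between the observed and expected scores on each item. | Find a point that divides the sequence into two parts with different statistical properties. |
PFS (person-fit statistics) | The one-sided statistics $C_{j}^{+}$ and $C_{j}^{-}$ and the two-sided statistic $C^{T}$, all based on the mean weighted residuals of the items. | Two-sided statistics: $L_{\max}$ (based on the likelihood ratio test), $W_{\max}$ (based on the Wald test), $S_{\max}$ (based on the score test), and $R_{\max}$ (based on weighted residuals), together with their one-sided forms. |
One-sided vs. two-sided statistics | Use a one-sided statistic when the target effect is specified before detection; use a two-sided statistic when the target effect is not specified or no specific requirement is placed on it. | |
Advantages | Produces plots and can be used for process monitoring. | Locates the change point automatically and precisely. |
Disadvantages | The change point must be located by manually inspecting the plots, so accuracy is lower. | The change point is hard to locate when it falls within the first or last few items of the sequence. |
Applicable situations | Model parameters before and after the change point are known. | Model parameters before and after the change point are unknown. $L_{\max}$, $W_{\max}$, and $S_{\max}$ suit high-stakes (educational) tests; $R_{\max}$ suits low-stakes (psychological) tests. |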
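
The two main ideas compared in the table can be illustrated with a minimal Python sketch. It is a simplified illustration under stated assumptions, not the statistics defined in the article: it assumes a Rasch model with known ability `theta` and known item difficulties, uses unweighted residuals, and replaces the likelihood-ratio, Wald, score, and weighted-residual statistics with a plain difference in mean residuals. The function names `expected_score`, `cusum_lower`, and `change_point_scan` are hypothetical.

```python
import math


def expected_score(theta, b):
    """Probability of a correct response under a Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def cusum_lower(responses, theta, difficulties):
    """One-sided lower CUSUM path.

    Residuals (observed minus expected score, divided by test length)
    are accumulated in item order; the running sum is capped at 0 so
    that only downward drift in performance builds up.
    """
    n = len(responses)
    c, path = 0.0, []
    for x, b in zip(responses, difficulties):
        resid = (x - expected_score(theta, b)) / n
        c = min(0.0, c + resid)
        path.append(c)
    return path


def change_point_scan(responses, theta, difficulties):
    """Scan every split point and return (k, stat).

    k is the number of items before the split; stat is the largest
    absolute difference between the mean residual of the first k items
    and the mean residual of the remaining items. This is a simplified
    stand-in for a two-sided maximum-type statistic.
    """
    resid = [x - expected_score(theta, b)
             for x, b in zip(responses, difficulties)]
    n = len(resid)
    best_k, best_stat = None, -1.0
    for k in range(1, n):
        before = sum(resid[:k]) / k
        after = sum(resid[k:]) / (n - k)
        stat = abs(before - after)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


if __name__ == "__main__":
    # Hypothetical examinee: correct on the first five items, then all wrong.
    x = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
    b = [0.0] * 10
    print(cusum_lower(x, 0.0, b))
    print(change_point_scan(x, 0.0, b))
```

The CUSUM function returns a whole path that an analyst would plot and inspect, while the scan returns a single estimated split point. That difference mirrors the advantages and disadvantages rows of the table. In practice the maximum statistic would still need to be compared against a critical value, which this sketch omits.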