Acta Psychologica Sinica  2018, Vol. 50 Issue (7): 761-770    DOI: 10.3724/SP.J.1041.2018.00761
Research Reports
Using game log-file to predict students' reasoning ability and mathematical achievement: An application of machine learning
Xin SUN1, Jian LI1,2, Zhiyu FU1
1 Faculty of Psychology, Beijing Normal University
2 Beijing Key Lab of Applied Experimental Psychology, Beijing 100875, China
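Summary

With 360 junior middle school students as participants and the Sokoban game as the task, this study combined game log-files with machine learning techniques to predict students' reasoning ability and mathematics achievement. The predictors were a series of features extracted from the Sokoban process data; the outcomes were scores on the Raven reasoning test and mathematics achievement, each converted to a dichotomous variable using 25% as the cutoff for the high and low groups. The trained models predicted reasoning ability with up to 76.11% precision, 65.72% accuracy, 63.10% recall, and a 65.01% F1 score, and predicted mathematics achievement with up to 83.07% precision, 73.70% accuracy, 73.33% recall, and a 75.57% F1 score. These results indicate that the discrimination models built with machine learning predict well, and that game process data recorded in log-files can be used to predict individual ability effectively.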

Keywords: video game; Sokoban; machine learning; reasoning ability; mathematical achievement
Abstract

As information technology advances, the shortcomings of traditional psychological testing, such as test anxiety and test exposure, are becoming more apparent. Some researchers have begun to assess individuals with game-based assessment, which has several advantages: it increases participants' motivation and engagement, and it makes log-file recording possible. However, current data analysis and scoring methods ignore much of the process information and therefore cannot accurately assess individual characteristics and abilities. Machine learning offers a new direction, because its algorithms can analyze log-file data by building complex models.

The present study combined game-based assessment, game log-files, and machine learning techniques to predict two participant outcomes: reasoning ability and mathematical achievement. Participants were 360 first- and second-grade students from a middle school in Beijing. The predictor variables were a series of features extracted from the game log-file; the outcome variables were dichotomous variables derived from Raven test scores and mathematics achievement, using the 25th and 75th percentiles as cutoffs. The random forest algorithm was used for model training: 70% of the sample was randomly selected for cross-validation and hyperparameter search, and predictions were then made on the remaining 30%.
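The data preparation described above can be sketched roughly as follows. This is a minimal standard-library illustration, not the authors' code: the function names `dichotomize` and `split_70_30` are hypothetical, and the sketch assumes that scores at or above the 75th percentile form the high group, scores at or below the 25th percentile form the low group, and the middle is left unlabeled. The abstract does not state how the middle of the distribution was handled.

```python
import random
import statistics


def dichotomize(scores):
    """Label each score 1 (high), 0 (low), or None (middle).

    Assumption: high = at or above the 75th percentile,
    low = at or below the 25th percentile.
    """
    q1, _, q3 = statistics.quantiles(scores, n=4, method="inclusive")
    labels = []
    for score in scores:
        if score >= q3:
            labels.append(1)
        elif score <= q1:
            labels.append(0)
        else:
            labels.append(None)  # middle of the distribution, left unlabeled here
    return labels


def split_70_30(items, seed=0):
    """Randomly split items into a 70% training part and a 30% held-out part."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

Under this reading, cross-validation and hyperparameter search would use only the 70% part, and the 30% part would be touched once, for the final prediction.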

Results showed that the logarithm of the ratio of first-step time to average execution time had the highest average feature importance, and that the number of steps differing from the optimal solution, the thinking time ratio, the fluctuation between executions, and the proportion of repeated steps all contributed to the mathematical achievement prediction model; the reasoning ability prediction model was similar. Using these important features, the reasoning ability prediction model reached up to 76.11% precision, 65.72% accuracy, 63.10% recall, and a 65.01% F1 score; the mathematical achievement prediction model reached up to 83.07% precision, 73.70% accuracy, 73.33% recall, and a 75.57% F1 score.

The findings show that the random forest model had acceptable predictive performance when classifying reasoning ability and mathematics achievement from the game log-file, with about 75% precision for reasoning and about 80% precision for mathematics. In conclusion, the research provides a new method for predicting students' cognitive ability and academic achievement: game log-files combined with machine learning can establish an effective discrimination model. This result can offer a reference and direction for the development of educational psychological assessment.

Key words: video game; Sokoban; machine learning; reasoning ability; mathematical achievement
Received: 2017-08-10      Published: 2018-05-29
ZTFLH:  B849: G44
Funding: * Supported by the Youth Special Project of the Beijing Education Science "Twelfth Five-Year" Plan (CBA15048)
Cite this article:
Xin SUN, Jian LI, Zhiyu FU. Using game log-file to predict students' reasoning ability and mathematical achievement: An application of machine learning. Acta Psychologica Sinica, 2018, 50(7): 761-770.
Link to this article:
http://journal.psych.ac.cn/xlxb/CN/10.3724/SP.J.1041.2018.00761      or      http://journal.psych.ac.cn/xlxb/CN/Y2018/V50/I7/761
  Screenshot of the Sokoban game interface
  A typical action sequence
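Feature                                           Mean      SD     Min     Max
Failure group
First-step time / mean execution time            22.71   24.26    2.52  198.34
ln(first-step time / mean execution time)         2.31    0.82    0.81    4.97
Proportion of boxes completed                     0.33    0.08    0.00    0.57
First-step time / total time                      0.22    0.12    0.04    0.76
ln(first-step time / total time)                 -1.92    0.60   -3.31   -0.29
Proportion of thinking steps                     -2.39    0.23   -3.04   -1.69
Mean execution time                               0.64    0.15    0.37    1.33
Fluctuation between executions                    2.15    1.20    0.35   10.52
Proportion of repeated steps                      0.07    0.03    0.00    0.20
Difference from optimal number of steps          -5.75    9.45  -23.36   65.78
Proportion of steps overlapping optimal path      0.17    0.04    0.04    0.32
Success group
First-step time / mean execution time            24.36   23.81    2.65  168.97
ln(first-step time / mean execution time)         2.49    0.78    0.92    4.95
First-step time / total time                      0.25    0.14    0.04    0.77
ln(first-step time / total time)                 -1.77    0.61   -3.18   -0.27
Proportion of thinking steps                     -2.61    0.27   -3.53   -1.64
Mean execution time                               0.48    0.11    0.33    1.18
Fluctuation between executions                    1.17    0.76    0.20    5.43
Proportion of repeated steps                      0.03    0.02    0.00    0.16
Difference from optimal number of steps           7.65    5.45    0.00   52.67
Proportion of steps overlapping optimal path      0.71    0.14    0.17    1.06
  Descriptive statistics of the features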
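To make the timing features in the table concrete, here is a minimal sketch of how the ratio features could be computed from one play-through. It is an illustration under stated assumptions, not the authors' extraction code: the input format (`step_times`, a list of seconds per move), the function name `extract_timing_features`, and the reading of "mean execution time" as the mean duration of all moves after the first are all hypothetical, since this page does not give the exact feature definitions.

```python
import math
from statistics import mean


def extract_timing_features(step_times):
    """Compute ratio features from per-move durations (in seconds).

    Assumptions: step_times[0] is the time before the first move,
    and the remaining entries are execution times of later moves.
    """
    if len(step_times) < 2:
        raise ValueError("need the first step plus at least one later step")
    first = step_times[0]
    mean_exec = mean(step_times[1:])  # assumed definition of mean execution time
    total = sum(step_times)
    return {
        "first_over_mean_exec": first / mean_exec,
        "ln_first_over_mean_exec": math.log(first / mean_exec),
        "first_over_total": first / total,
        "ln_first_over_total": math.log(first / total),
        "mean_exec": mean_exec,
    }
```

For example, a player who pauses 10 seconds before the first move and then makes four moves of 0.5 seconds each gets a first-step ratio of 20 under these assumptions.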
Outcome type       Predicted positive    Predicted negative
Actual positive    TP                    FN
Actual negative    FP                    TN
  Classification performance evaluation table
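The four metrics reported in the results can be derived from the cells of this table. The sketch below uses the standard definitions of precision, recall, accuracy, and F1; the function name `classification_metrics` is hypothetical, and the page itself does not print the formulas, so treat this as an illustration rather than the authors' exact computation. It assumes all denominators are nonzero.

```python
def classification_metrics(tp, fn, fp, tn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)                    # share of predicted positives that are correct
    recall = tp / (tp + fn)                       # share of actual positives that are found
    accuracy = (tp + tn) / (tp + fn + fp + tn)    # share of all cases classified correctly
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f1": f1,
    }
```

With made-up counts TP = 40, FN = 10, FP = 20, TN = 30, this gives a recall of 0.8 and an accuracy of 0.7.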
  Top ten features by average importance in the mathematical achievement prediction model
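Optimization target             F1 Precision    Recall  Accuracy
Reasoning ability
F1 prioritized              68.83%    74.40%    61.19%    63.46%
Precision prioritized       63.72%    75.51%    59.17%    65.03%
Recall prioritized          65.01%    74.91%    63.10%    64.21%
Accuracy prioritized        64.22%    76.11%    59.05%    65.72%
Mathematical achievement
F1 prioritized              71.14%    79.35%    71.11%    68.02%
Precision prioritized       75.57%    83.07%    73.33%    73.70%
Recall prioritized          73.09%    81.06%    71.78%    70.62%
Accuracy prioritized        71.65%    80.19%    69.67%    69.44%
  Model prediction results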
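One plausible reading of the "optimization target" rows is that the hyperparameter search kept whichever candidate configuration scored best on the named metric during cross-validation. The sketch below illustrates only that selection step. Everything in it is hypothetical: the helper `select_by_target`, the parameter names, and the cross-validated scores are made up for illustration and are not taken from the paper.

```python
# Hypothetical cross-validated scores for three candidate configurations.
CANDIDATES = [
    {"params": {"n_estimators": 100, "max_depth": 3},
     "scores": {"f1": 0.66, "precision": 0.74, "recall": 0.60, "accuracy": 0.63}},
    {"params": {"n_estimators": 300, "max_depth": 5},
     "scores": {"f1": 0.64, "precision": 0.76, "recall": 0.58, "accuracy": 0.65}},
    {"params": {"n_estimators": 500, "max_depth": None},
     "scores": {"f1": 0.65, "precision": 0.75, "recall": 0.63, "accuracy": 0.64}},
]


def select_by_target(candidates, target):
    """Return the candidate with the highest score on the target metric."""
    return max(candidates, key=lambda c: c["scores"][target])


if __name__ == "__main__":
    for target in ("f1", "precision", "recall", "accuracy"):
        best = select_by_target(CANDIDATES, target)
        print(target, best["params"])
```

Different targets can select different configurations, which would explain why each row of the table reports a different set of four metric values.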