ISSN 0439-755X
CN 11-1911/B

Acta Psychologica Sinica ›› 2018, Vol. 50 ›› Issue (7): 761-770. doi: 10.3724/SP.J.1041.2018.00761

• Research Reports •


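SUN Xin1, LI Jian1,2, FU Zhiyu1

  1. 1 Faculty of Psychology, Beijing Normal University
    2 Beijing Key Lab of Applied Experimental Psychology, Beijing 100875
  • Received: 2017-08-10 Online: 2018-05-29 Published: 2018-07-15
  • Funding:
    * Supported by the Youth Special Project of the Beijing Education Science "Twelfth Five-Year Plan" (CBA15048)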

Using game log-file to predict students' reasoning ability and mathematical achievement: An application of machine learning

SUN Xin1, LI Jian1,2, FU Zhiyu1

  1. 1 Faculty of Psychology, Beijing Normal University
    2 Beijing Key Lab of Applied Experimental Psychology, Beijing 100875, China
  • Received:2017-08-10 Online:2018-05-29 Published:2018-07-15


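Using a sample of 360 junior middle school students and the game Sokoban, this study combined game log-files with machine learning to predict students' reasoning ability and mathematics achievement. The predictors were a set of feature indicators extracted from the Sokoban process data; the outcome variables were scores on Raven's reasoning test and mathematics achievement, each converted to a dichotomous variable using 25% as the cutoff for the high and low groups. The results showed that, in predicting reasoning ability, the trained models reached at best 76.11% precision, 65.72% accuracy, 63.10% recall, and a 65.01% F1 score; in predicting mathematics achievement, they reached at best 83.07% precision, 73.70% accuracy, 73.33% recall, and a 75.57% F1 score. These results indicate that the classification models built with machine learning predict well, and that game process data recorded in log-files can be used to predict individual ability effectively.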

Keywords: video game, Sokoban, machine learning, reasoning ability, mathematical achievement


With the progress of information technology, the shortcomings of traditional psychological testing, such as test anxiety and test exposure, are becoming more apparent. Some researchers have begun to assess individuals with game-based assessment, which has several advantages: it increases participants' motivation and engagement, and it makes log-file technology feasible. However, current data analysis and scoring logic ignore a substantial amount of process information and therefore cannot accurately assess individual characteristics and abilities. Machine learning offers a new direction here, because its algorithms can analyze log-file data by building complex models.

The present study combined game-based assessment, game log-files, and machine learning techniques to predict two outcomes: participants' reasoning ability and mathematical achievement. Participants were 360 first- and second-grade students from a middle school in Beijing. The predictor variables were a series of features extracted from the game log-file; the outcome variables were dichotomous variables derived from Raven test scores and mathematics achievement, with the 25th and 75th percentiles as cutoffs. For model training, the random forest algorithm was selected: 70% of the samples were randomly selected for cross-validation and hyperparameter search, and predictions were then made on the remaining 30%.
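The labeling and hold-out steps described above can be sketched with the Python standard library alone. This is a minimal illustration, not the authors' code: the function names `flag_extreme_group` and `split_indices`, the inclusive percentile method, the fixed seed, and the example scores are all assumptions. The abstract does not say whether each extreme group was contrasted with the remaining participants or with the opposite extreme, so the sketch only flags group membership. Random forest training and the hyperparameter search are omitted.

```python
import random
import statistics


def flag_extreme_group(scores, pct, upper):
    """Return 1 for scores in the extreme group beyond the pct-th percentile, else 0.

    upper=True flags scores at or above the cutoff; upper=False flags scores
    at or below it. The percentile method is an assumption of this sketch.
    """
    cutoff = statistics.quantiles(scores, n=100, method="inclusive")[pct - 1]
    if upper:
        return [1 if s >= cutoff else 0 for s in scores]
    return [1 if s <= cutoff else 0 for s in scores]


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Shuffle participant indices and split them into training and held-out sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


# Hypothetical scores for illustration only (not data from the study).
scores = list(range(1, 101))
low_group = flag_extreme_group(scores, 25, upper=False)
high_group = flag_extreme_group(scores, 75, upper=True)

# 360 participants, as in the study; 70% for training, 30% held out.
train_idx, test_idx = split_indices(360)
```

Under these assumptions, cross-validation and hyperparameter search would use only `train_idx`, and `test_idx` would be touched once for the final prediction.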

Results showed that the logarithm of the ratio of the first-step time to the average execution time was the feature with the highest average importance. The number of steps that differed from the optimal solution, the thinking-time ratio, the between-execution fluctuation, and the proportion of repeated steps also contributed to the mathematical achievement prediction model; the reasoning ability prediction model showed a similar pattern. With these important features, the reasoning ability prediction model reached 76.11% precision, 65.72% accuracy, 63.10% recall, and a 65.01% F1 score; the mathematical achievement prediction model reached 83.07% precision, 73.70% accuracy, 73.33% recall, and a 75.57% F1 score.
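For readers less familiar with the four reported metrics, the sketch below computes them from binary labels using the standard definitions. The function name `classification_metrics` and the example labels are hypothetical and are not taken from the study. Note that the Chinese abstract describes the reported figures as the highest values obtained, so they need not all come from a single confusion matrix.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only (not data from the study).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

Precision answers "of the students the model flagged, how many truly belonged to the group", while recall answers "of the students who truly belonged, how many were flagged".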

The findings of the present study showed that the random forest model had acceptable predictive performance when classifying reasoning ability and mathematics achievement from the game log-file, with precision of approximately 75% for reasoning and 80% for mathematics. In conclusion, the research provides a new method for predicting students' cognitive ability and academic achievement: game log-files combined with machine learning can yield an effective discrimination model. This result can offer some reference and direction for the development of educational and psychological assessment.

Key words: video game, Sokoban, machine learning, reasoning ability, mathematical achievement.