With advances in information technology, the shortcomings of traditional psychological testing, such as test anxiety and test exposure, have become increasingly apparent. Some researchers have begun to use game-based assessment, which offers several advantages: it increases participants' motivation and engagement, and it makes it possible to record behavior in log files. However, current data analysis and scoring methods ignore much of the process information in these logs and therefore cannot accurately assess individual characteristics and abilities. Machine learning offers a new direction: its algorithms can analyze log-file data by building complex models.
The present study combined game-based assessment, game log files, and machine learning to predict two participant outcomes: reasoning ability and mathematical achievement. Participants were 360 first- and second-grade students from a middle school in Beijing. The predictor variables were a set of features extracted from the game log files; the outcome variables were dichotomous variables derived from Raven test scores and mathematics achievement, with the 25th and 75th percentiles as cutoffs. A random forest algorithm was used for model training: 70% of the samples were randomly selected for cross-validation and hyperparameter search, and predictions were then made on the remaining 30% of samples.
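The labelling and data-splitting steps described above can be sketched as follows. This is a minimal illustration, not the study's code. It assumes one possible reading of the cutoffs (scores at or below the 25th percentile form the low group, scores at or above the 75th percentile form the high group, and scores in between are left out), and the function names, the five-fold setting, and the placeholder scores are all hypothetical. The random forest training itself is not shown.

```python
import random
import statistics


def dichotomize(scores, low_pct=25, high_pct=75):
    """Return (index, label) pairs for scores in the bottom or top group.

    Assumption: label 0 = at or below the low percentile, label 1 = at or
    above the high percentile; scores in between are left out.
    """
    # quantiles(n=100) returns 99 cut points; entry p-1 is the p-th percentile.
    cuts = statistics.quantiles(scores, n=100, method="inclusive")
    low_cut, high_cut = cuts[low_pct - 1], cuts[high_pct - 1]
    labelled = []
    for index, score in enumerate(scores):
        if score <= low_cut:
            labelled.append((index, 0))
        elif score >= high_cut:
            labelled.append((index, 1))
    return labelled


def split_train_test(items, train_fraction=0.7, seed=0):
    """Shuffle the items and split them into a training and a held-out part."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def make_folds(items, k=5):
    """Partition the training items into k disjoint cross-validation folds."""
    return [items[i::k] for i in range(k)]


# Hypothetical scores for 360 participants (placeholder data, not study data).
scores = list(range(1, 361))
labelled = dichotomize(scores)
train, test = split_train_test(labelled)
folds = make_folds(train)
print(len(labelled), len(train), len(test), [len(f) for f in folds])
```

In this sketch the hyperparameter search would run only on the folds built from the 70% training part, so the 30% held-out part stays untouched until the final prediction.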
Results showed that the logarithm of the ratio of first-step time to average execution time had the highest average feature importance, and that the number of steps deviating from the optimal solution, the thinking-time ratio, the fluctuation between execution times, and the proportion of repeated steps all contributed to the mathematical achievement prediction model; the reasoning ability prediction model showed a similar pattern. Using these features, the reasoning ability prediction model achieved 76.11% precision, 65.72% accuracy, 63.10% recall, and a 65.01% F1 score; the mathematical achievement prediction model achieved 83.07% precision, 73.70% accuracy, 73.33% recall, and a 75.57% F1 score.
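To make the top-ranked feature and the reported metrics concrete, the sketch below computes both on toy data. It rests on assumptions that the abstract does not confirm: that each trial is logged as a list of per-step durations, that "first-step time" is the duration of the first step, that "average execution time" is the mean duration of the remaining steps, and that the logarithm is natural. The function names and the toy values are hypothetical.

```python
import math


def log_first_step_ratio(step_durations):
    """Natural log of first-step time divided by mean time of later steps.

    Assumption: step_durations holds one positive duration per step and
    contains at least two steps.
    """
    first, rest = step_durations[0], step_durations[1:]
    mean_execution = sum(rest) / len(rest)
    return math.log(first / mean_execution)


def classification_metrics(y_true, y_pred):
    """Precision, recall, accuracy and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"precision": precision, "recall": recall,
            "accuracy": accuracy, "f1": f1}


# Toy trial: a long first step followed by quick moves (placeholder data).
feature = log_first_step_ratio([8.0, 2.0, 2.0, 2.0])

# Toy labels and predictions (placeholder data, not study results).
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 1, 0],
                                 [1, 1, 0, 0, 0, 1, 1, 0])
print(feature, metrics)
```

Under these assumptions, a larger feature value means the participant spent more time before the first move relative to the pace of the later moves.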
These findings indicate that the random forest model predicted the classification of reasoning ability and mathematics achievement from game log files with acceptable performance, with precision above 75% for reasoning ability and above 80% for mathematics achievement. In conclusion, this research provides a new method for predicting students' cognitive ability and academic achievement: game log files combined with machine learning can yield an effective discrimination model. The results offer a reference and direction for the development of educational and psychological assessment.