ISSN 0439-755X
CN 11-1911/B

Acta Psychologica Sinica, 2019, Vol. 51, Issue (9): 1057-1067. doi: 10.3724/SP.J.1041.2019.01057

• Reports of Empirical Studies •

Make adaptive testing know examinees better: The item selection strategies based on recommender systems

WANG Pujue1, LIU Hongyun1,2

  1. Faculty of Psychology, Beijing Normal University
    2. Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
  • Received: 2018-06-10 Published: 2019-09-25 Online: 2019-07-24
  • Contact: Hongyun LIU


Better item selection strategies for computerized adaptive testing (CAT) may be designed by making fuller use of information from previous examinees’ responses. Past examinees’ data serve as a valuable reference for selecting items more accurately and more evenly for new examinees. However, most existing strategies proposed under the theoretical framework of item response theory (IRT) use information from the current examinee only and fail to take full advantage of past examinees’ data. The collaborative filtering approach from the recommender system literature finds items that best match a user’s preferences by utilizing information from other users, a goal similar to that of item selection in CAT. Therefore, the present study adapted the underlying assumptions of collaborative filtering and proposed new item selection strategies that take advantage of past examinees’ data, and then investigated factors that might affect the performance of the new strategies.
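
To make concrete what "using information from the current examinee only" means, the following minimal sketch selects the next item by maximizing Fisher information at the current ability estimate. It assumes a two-parameter logistic (2PL) model, which the abstract does not specify; the item bank, parameter values, and function names are hypothetical and are not taken from the study.

```python
import math

def prob_correct(theta, a, b):
    """2PL probability of a correct response (assumed model, not stated in the abstract)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def fisher_information(theta, a, b):
    """Item information under the 2PL model: a^2 * P * (1 - P)."""
    p = prob_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def select_max_info_item(theta_hat, item_bank, administered):
    """Pick the unadministered item with the largest information at theta_hat.

    item_bank maps a hypothetical item id to its (a, b) parameters.
    Only the current examinee's ability estimate is used.
    """
    candidates = [i for i in item_bank if i not in administered]
    return max(candidates, key=lambda i: fisher_information(theta_hat, *item_bank[i]))

# Hypothetical four-item bank: item id -> (discrimination a, difficulty b).
item_bank = {"i1": (1.2, -1.0), "i2": (0.8, 0.0), "i3": (1.5, 0.1), "i4": (1.0, 1.5)}
next_item = select_max_info_item(0.0, item_bank, administered={"i2"})
```

Because such a rule favors highly discriminating items near the ability estimate, the same few items tend to be chosen repeatedly, which is the kind of uneven item utilization that past examinees' data are meant to help correct.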

Drawing on user-based collaborative filtering, we defined similar examinees as a group of examinees who uniformly answered the same items, and proposed two strategies: Direct Examinee-Based Recommender (DEBR) and Indirect Examinee-Based Recommender (IEBR). Two simulation studies were conducted to examine the measurement accuracy and item exposure control of the new strategies under different conditions. In Study 1, a simulated item bank was used. The recommender-based strategies used two types of past examinees’ data, generated by FMI and BAS respectively, to select items in two fixed-length CATs. In Study 2, a real item bank was used to test the new strategies in a more realistic setting. The effect of combining two batches of past examinees’ data from different recommender-based strategies was also investigated.
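
The general idea of examinee-based recommendation can be sketched as follows. This is an illustration of user-based collaborative filtering applied to item selection, not the DEBR or IEBR algorithm, whose details the abstract does not give. It assumes that "similar" means a past examinee gave the same response to every item the current examinee has answered so far, and that the recommended item is the one most often taken by those similar examinees. All names and data are hypothetical.

```python
from collections import Counter

def similar_examinees(current, past_records):
    """Past examinees who gave the same response to every item the current
    examinee has answered (an assumed reading of 'similar examinees')."""
    return [rec for rec in past_records
            if all(rec.get(item) == resp for item, resp in current.items())]

def recommend_item(current, past_records):
    """Return the unadministered item seen most often among similar examinees,
    or None when no similar examinee exists (the caller would then fall back
    to another selection rule)."""
    votes = Counter()
    for rec in similar_examinees(current, past_records):
        votes.update(item for item in rec if item not in current)
    if not votes:
        return None
    # Deterministic tie-break: most votes first, then smallest item id.
    return min(votes, key=lambda item: (-votes[item], item))

# Hypothetical past data: each record maps item id -> response (1 correct, 0 incorrect).
past_records = [
    {"i1": 1, "i2": 0, "i5": 1},
    {"i1": 1, "i2": 0, "i5": 0, "i7": 1},
    {"i1": 1, "i2": 1, "i3": 1},
    {"i1": 0, "i4": 1},
]
current = {"i1": 1, "i2": 0}
recommended = recommend_item(current, past_records)
```

In this sketch the quality of a recommendation depends entirely on the strategy that generated the past records, which is why the studies compare past data produced by different selection methods.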

In both studies, when using past examinees’ data with high accuracy but poor item exposure control (generated by FMI), the recommender-based strategies greatly remedied unbalanced item utilization with an acceptable loss of accuracy. When using past examinees’ data with a better tradeoff between measurement precision and test security (generated by BAS), the recommender-based strategies kept accuracy at the same level and further improved item exposure control. More specifically, DEBR focused on maintaining accuracy and had lower measurement error than IEBR, whereas IEBR was better at controlling item exposure and made fuller use of the whole item bank than all the other strategies. These features of the two recommender-based strategies were stable and consistent across different item banks and different CAT lengths. The extent to which DEBR and IEBR demonstrated these features was influenced by the quality of the item bank, the test length, the number of past examinees, and the strategy used to generate the past data.
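
The two evaluation criteria discussed here, measurement accuracy and item exposure, are commonly summarized with indices such as the root mean squared error of ability estimates and per-item exposure rates. The sketch below shows how such indices could be computed from simulation output. The abstract does not say which indices the study used, so these are assumptions, and all names and values are hypothetical.

```python
import math
from collections import Counter

def rmse(true_thetas, estimated_thetas):
    """Root mean squared error between true and estimated abilities."""
    n = len(true_thetas)
    return math.sqrt(sum((e - t) ** 2 for t, e in zip(true_thetas, estimated_thetas)) / n)

def exposure_rates(administered_tests, item_ids):
    """Share of examinees who saw each item; never-used items get a rate of 0."""
    counts = Counter(item for test in administered_tests for item in test)
    n = len(administered_tests)
    return {item: counts[item] / n for item in item_ids}

# Hypothetical output of a small simulation: one item list per examinee.
tests = [["i1", "i2"], ["i1", "i3"], ["i1", "i2"], ["i2", "i3"]]
rates = exposure_rates(tests, ["i1", "i2", "i3", "i4"])
error = rmse([0.0, 1.0], [0.5, 0.5])
```

A strategy with good exposure control would show few items with very high rates and few items that are never used, while a strategy with good accuracy would show a low RMSE.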

In general, this research combined recommender systems with CAT item selection methods to establish a new, flexible framework that extends traditional item selection strategies. It also provided empirical evidence for the value of past examinees’ data and for the recommender system approach as a feasible alternative for selecting items in CAT. Finally, suggestions for future studies were provided: investigating the proposed strategies in a wider range of situations and upgrading recommender-based strategies for more CAT conditions, including finding diverse measures of similarity between examinees or items and employing more complex recommender system algorithms to meet the demands of large-scale tests.

Key words: selection strategy, past examinees’ data, recommender system, collaborative filtering recommender, simulation study
