ISSN 0439-755X
CN 11-1911/B
Sponsors: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Publisher: Science Press

Acta Psychologica Sinica, 2019, Vol. 51, Issue (9): 1057-1067. doi: 10.3724/SP.J.1041.2019.01057

• Research Report •

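Making adaptive testing better at knowing examinees and choosing items: Item selection strategies based on recommender systems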

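WANG Pujue1, LIU Hongyun1,2

  1. Faculty of Psychology, Beijing Normal University
  2. Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing 100875
  • Received: 2018-06-10 Published: 2019-09-25 Online: 2019-07-24
  • Corresponding author: LIU Hongyun, E-mail: hyliu@bnu.edu.cn
  • Funding:
    * National Natural Science Foundation of China (31571152); Joint Construction Project of Beijing Municipality and Central Universities in Beijing (019-105812); 2017 National Education Examinations Research Planning Project (GJK2017015)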

Make adaptive testing know examinees better: The item selection strategies based on recommender systems

WANG Pujue1, LIU Hongyun1,2

  1. Faculty of Psychology, Beijing Normal University
  2. Beijing Key Laboratory of Applied Experimental Psychology, Faculty of Psychology, Beijing Normal University, Beijing, 100875, China
  • Received: 2018-06-10 Online: 2019-07-24 Published: 2019-09-25
  • Contact: LIU Hongyun, E-mail: hyliu@bnu.edu.cn

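Summary:

Drawing on the idea of collaborative filtering from recommender systems, this study proposes two item selection strategies for computerized adaptive testing (CAT) that make use of past examinees’ data: Direct Examinee-Based Recommender (DEBR) and Indirect Examinee-Based Recommender (IEBR). Two simulation studies, covering different item banks and test lengths, compared the two recommender-based strategies with two traditional strategies (FMI and BAS) on measurement accuracy and item exposure control, and examined the factors that affect the performance of the recommender-based strategies. Both recommender-based strategies controlled item exposure better than the two traditional strategies, with measurement accuracy no worse than that of BAS; DEBR prioritized accuracy, whereas IEBR gave the best control of item exposure. The characteristics and quality of the past examinees’ data were the main factors affecting the performance of the recommender-based strategies.

Keywords: item selection strategy, past examinees’ data, recommender system, collaborative filtering recommendation, simulation study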

Abstract:

Better item selection strategies for computerized adaptive testing (CAT) can be designed by making fuller use of information from previous examinees’ responses. Past examinees’ data serve as a valuable reference for selecting items more accurately and evenly for new examinees. However, most existing strategies proposed under the theoretical framework of item response theory (IRT) use only information from the current examinee and fail to take full advantage of past examinees’ data. The collaborative filtering approach from the recommender system literature finds the items that best match a user’s preferences by utilizing information from other users, a goal similar to that of item selection in CAT. The present study therefore adapted the underlying assumptions of collaborative filtering to propose new item selection strategies that take advantage of past examinees’ data, and then investigated the factors that might affect the performance of the new strategies.

Following user-based collaborative filtering, we defined similar examinees as a group of examinees who uniformly answered the same items, and proposed two strategies: Direct Examinee-Based Recommender (DEBR) and Indirect Examinee-Based Recommender (IEBR). Two simulation studies examined the measurement accuracy and item exposure control of the new strategies under different conditions. In Study 1, a simulated item bank was used. The recommender-based strategies used two types of past examinees’ data, generated by FMI and BAS respectively, to select items in two fixed-length CATs. In Study 2, a real item bank was used to test the new strategies in a more realistic setting. The effect of combining two batches of past examinees’ data from different recommender-based strategies was also investigated.
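The abstract does not specify how DEBR and IEBR compute their recommendations, so the following is only a minimal sketch of the general idea of recommending items from similar past examinees, not the authors’ algorithm. The function name `recommend_next_item`, the data layout, the matching rule (same items with the same scored responses), and the majority-vote rule are all assumptions made for illustration.

```python
from collections import Counter


def recommend_next_item(past_records, current_responses):
    """Suggest the next item for the current examinee (illustrative only).

    past_records: list of dicts mapping item id -> scored response (0 or 1),
        one dict per past examinee (hypothetical data layout).
    current_responses: dict mapping item id -> scored response (0 or 1)
        for the items the current examinee has answered so far.
    Returns the item id most often taken by similar past examinees,
    or None when no similar examinee offers an unused item.
    """
    votes = Counter()
    for record in past_records:
        # Assumed similarity rule: the past examinee answered every item
        # the current examinee has seen, with the same scored response.
        is_similar = all(
            record.get(item) == response
            for item, response in current_responses.items()
        )
        if not is_similar:
            continue
        # Each similar examinee votes for the items not yet administered.
        for item in record:
            if item not in current_responses:
                votes[item] += 1
    if not votes:
        return None
    top = max(votes.values())
    # Break ties by item id so the result is deterministic.
    return min(item for item, count in votes.items() if count == top)


past = [
    {"i1": 1, "i2": 0, "i3": 1},
    {"i1": 1, "i2": 0, "i4": 1},
    {"i1": 0, "i5": 1},
    {"i1": 1, "i2": 0, "i3": 0},
]
print(recommend_next_item(past, {"i1": 1, "i2": 0}))
```

In this toy data, three past examinees match the current response pattern, and two of them went on to answer `i3`, so that item is suggested. A practical strategy would also need a fallback rule for the case where no similar examinee is found.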

In both studies, when using past examinees’ data with high accuracy but poor item exposure control (generated by FMI), the recommender-based strategies greatly remedied unbalanced item utilization with an acceptable loss of accuracy. When using past examinees’ data with a better tradeoff between measurement precision and test security (generated by BAS), the recommender-based strategies kept accuracy at the same level and further improved item exposure control. More specifically, DEBR focused on maintaining accuracy and had lower measurement error than IEBR, whereas IEBR was better at controlling item exposure and made fuller use of the whole item bank than all the other strategies. These features of the two recommender-based strategies were stable and consistent across different item banks and CAT lengths. The extent to which DEBR and IEBR demonstrated these features was influenced by the quality of the item bank, the test length, the number of past examinees, and the strategy used to generate the past examinees’ data.
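The abstract does not say which indices were used to quantify measurement accuracy and item exposure. As a hedged illustration, the sketch below computes three quantities commonly reported in CAT simulation studies: per-item exposure rates, a chi-square index of how unevenly the bank is used, and the root mean square error (RMSE) of ability estimates. The function names and the toy data are assumptions, and these may not be the exact indices reported in the paper.

```python
import math


def exposure_rates(administered, bank_size):
    """Share of examinees who received each item.

    administered: list of item-index lists, one per examinee.
    bank_size: number of items in the bank (items are indexed 0..bank_size-1).
    """
    counts = [0] * bank_size
    for items in administered:
        for item in items:
            counts[item] += 1
    n_examinees = len(administered)
    return [count / n_examinees for count in counts]


def exposure_chi_square(rates, test_length):
    """Chi-square index of exposure unevenness.

    Under perfectly even use, every item has rate test_length / bank_size,
    and the index is 0; larger values mean more skewed item usage.
    """
    expected = test_length / len(rates)
    return sum((rate - expected) ** 2 / expected for rate in rates)


def rmse(estimates, true_values):
    """Root mean square error between estimated and true abilities."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, true_values)]
    return math.sqrt(sum(squared) / len(squared))


# Toy example: 4 examinees, 2-item tests, 4-item bank; item 0 is overused.
administered = [[0, 1], [0, 2], [0, 1], [0, 3]]
rates = exposure_rates(administered, bank_size=4)
print(rates)
print(exposure_chi_square(rates, test_length=2))
print(rmse([0.5, -0.5], [0.0, 0.0]))
```

Comparing such indices across strategies is one way to express the tradeoff described above: a strategy can lower the exposure index while leaving RMSE roughly unchanged, or accept a slightly higher RMSE for much more even bank usage.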

In general, this research combined recommender systems with CAT item selection methods to establish a new, flexible framework that goes beyond traditional item selection strategies. It also provided empirical evidence for the value of past examinees’ data and for the recommender system approach as a feasible alternative for selecting items in CAT. Finally, suggestions for future studies were provided: investigating the proposed strategies in a wider range of situations and extending the recommender-based strategies to more CAT conditions, including finding diverse measures of similarity between examinees or items and employing more complex recommender system algorithms to meet the demands of large-scale tests.

Key words: selection strategy, past examinees’ data, recommender system, collaborative filtering recommender, simulation study
