[1] |
Arthur, W., Edwards, B. D., & Barrett, G. V. (2002). Multiple-choice and constructed response tests of ability: Race-based subgroup performance differences on alternative paper-and-pencil test formats. Personnel Psychology, 55(4), 985-1008.
|
[2] |
Bacon, D. R. (2003). Assessing learning outcomes: A comparison of multiple-choice and short-answer questions in a marketing context. Journal of Marketing Education, 25(1), 31-36.
|
[3] |
Basu, T., & Murthy, C. A. (2013, December). Effective text classification by a supervised feature selection approach. IEEE 12th International Conference on Data Mining Workshops (ICDM) (pp. 918-925). Brussels, Belgium.
|
[4] |
Burrows, S., Gurevych, I., & Stein, B. (2015). The eras and trends of automatic short answer grading. International Journal of Artificial Intelligence in Education, 25(1), 60-117.
|
[5] |
Burrus, J., Betancourt, A., Holtzman, S., Minsky, J., MacCann, C., & Roberts, R. D. (2012). Emotional intelligence relates to well-being: Evidence from the situational judgment test of emotional management. Applied Psychology: Health and Well-Being, 4(2), 151-166.
|
[6] |
Cucina, J. M., Su, C., Busciglio, H. H., Thomas, P. H., & Peyton, S. T. (2015). Video-based testing: A high-fidelity job simulation that demonstrates reliability, validity, and utility. International Journal of Selection and Assessment, 23(3), 197-209.
|
[7] |
Downer, K., Wells, C., & Crichton, C. (2019). All work and no play: A text analysis. International Journal of Market Research, 61(3), 236-251.
|
[8] |
Edwards, B. D., & Arthur, W., Jr. (2007). An examination of factors contributing to a reduction in subgroup differences on a constructed-response paper-and-pencil test of scholastic achievement. Journal of Applied Psychology, 92(3), 794-801.
pmid: 17484558
|
[9] |
Finch, W. H., Finch, M. E. H., McIntosh, C. E., & Braun, C. (2018). The use of topic modeling with latent Dirichlet analysis with open-ended survey items. Translational Issues in Psychological Science, 4(4), 403-424.
|
[10] |
Funke, U., & Schuler, H. (1998). Validity of stimulus and response components in a video test of social competence. International Journal of Selection and Assessment, 6(2), 115-123.
|
[11] |
Gu, H. L., & Wen, Z. L. (2017). Reporting and interpreting multidimensional test scores: A bi-factor perspective. Psychological Development and Education, 33(4), 504-512.
|
|
[顾红磊, 温忠麟. (2017). 多维测验分数的报告与解释: 基于双因子模型的视角. 心理发展与教育, 33(4), 504-512.]
|
[12] |
Guo, F., Gallagher, C. M., Sun, T., Tavoosi, S., & Min, H. (2021). Smarter people analytics with organizational text data: Demonstrations using classic and advanced NLP models. Human Resource Management Journal. https://doi.org/10.1111/1748-8583.12426
|
[13] |
Iliev, R., Dehghani, M., & Sagi, E. (2015). Automated text analysis in psychology: Methods, applications, and future developments. Language and Cognition, 7(2), 265-290.
|
[14] |
Kastner, M., & Stangla, B. (2011). Multiple choice and constructed response tests: Do test format and scoring matter? Procedia-Social and Behavioral Sciences, 12, 263-273.
|
[15] |
Kim, Y. (2014). Convolutional neural networks for sentence classification. Proceedings of the 19th Conference on Empirical Methods in Natural Language Processing, 1746-1751.
|
[16] |
Kjell, O. E., Kjell, K., Garcia, D., & Sikstrom, S. (2019). Semantic measures: Using natural language processing to measure, differentiate, and describe psychological constructs. Psychological Methods, 24(1), 92-115.
doi: 10.1037/met0000191
pmid: 29963879
|
[17] |
Lai, S., Xu, L., Liu, K., & Zhao, J. (2015). Recurrent convolutional neural networks for text classification. Proceedings of the 29th AAAI Conference on Artificial Intelligence, 2267-2273.
|
[18] |
Lee, B. C., & Kim, B. Y. (2021). Development of an AI-based interview system for remote hiring. International Journal of Advanced Research in Engineering and Technology, 12(3), 654-663.
|
[19] |
Lievens, F., de Corte, W., & Westerveld, L. (2015). Understanding the building blocks of selection procedures: Effects of response fidelity on performance and validity. Journal of Management, 41(6), 1604-1627.
|
[20] |
Lievens, F., Sackett, P. R., Dahlke, J. A., Oostrom, J. K., & de Soete, B. (2019). Constructed response formats and their effects on minority-majority differences and validity. Journal of Applied Psychology, 104(5), 715-726.
doi: 10.1037/apl0000367
pmid: 30431296
|
[21] |
Ling, C. (2020). Development of Classroom Observation Scale to Promote the Professional Development of New Teachers (Unpublished master's thesis). Beijing Normal University.
|
|
[凌晨. (2020). 课堂观察量表的开发——促进初任教师专业发展 (硕士学位论文). 北京师范大学.]
|
[22] |
Lubis, F. F., Mutaqin, Putri, A., Waskita, D., Sulistyaningtyas, T., Arman, A. A., & Rosmansyah, Y. (2021). Automated short-answer grading using semantic similarity based on word embedding. International Journal of Technology, 12(3), 571-581.
|
[23] |
Marentette, B. J., Meyers, L. S., Hurtz, G. M., & Kuang, D. C. (2012). Order effects on situational judgment test items: A case of construct-irrelevant difficulty. International Journal of Selection and Assessment, 20(3), 319-332.
|
[24] |
McDaniel, M. A., Hartman, N. S., Whetzel, D. L., & Grubb, W. L. (2007). Situational judgment tests, response instructions, and validity: A meta-analysis. Personnel Psychology, 60(1), 63-91.
|
[25] |
McDaniel, M. A., Morgeson, F. P., Finnegan, E. B., Campion, M. A., & Braverman, E. P. (2001). Use of situational judgment tests to predict job performance: A clarification of the literature. Journal of Applied Psychology, 86(4), 730-740.
pmid: 11519656
|
[26] |
McDaniel, M. A., Psotka, J., Legree, P. J., Yost, A. P., & Weekley, J. A. (2011). Toward an understanding of situational judgment item validity and group differences. Journal of Applied Psychology, 96(2), 327-336.
doi: 10.1037/a0021983
pmid: 21261409
|
[27] |
Oostrom, J. K., Born, M. P., Serlie, A. W., & van der Molen, H. T. (2010). Webcam testing: Validation of an innovative open-ended multimedia test. European Journal of Work and Organizational Psychology, 19(5), 532-550.
|
[28] |
Oostrom, J. K., Born, M. P., Serlie, A. W., & van der Molen, H. T. (2011). A multimedia situational test with a constructed-response format: Its relationship with personality, cognitive ability, job experience, and academic performance. Journal of Personnel Psychology, 10(2), 78-88.
|
[29] |
Oostrom, J. K., Born, M. P., Serlie, A. W., & van der Molen, H. T. (2012). Implicit trait policies in multimedia situational judgment tests for leadership skills: Can they predict leadership behavior? Human Performance, 25(4), 335-353.
|
[30] |
Pang, N., Zhao, X., Wang, W., Xiao, W., & Guo, D. (2021). Few-shot text classification by leveraging bi-directional attention and cross-class knowledge. Science China Information Sciences, 64(3), 130103.
|
[31] |
Qi, S. Q., & Dai, H. Q. (2003). The property, function and the development of situational judgment tests. Psychological Exploration, 23(4), 42-46.
|
|
[漆书青, 戴海琦. (2003). 情景判断测验的性质、功能与开发编制. 心理学探新, 23(4), 42-46.]
|
[32] |
Ramesh, D., & Sanampudi, S. K. (2022). An automated essay scoring systems: A systematic literature review. Artificial Intelligence Review, 55(3), 2495-2527.
|
[33] |
Ramineni, C., Trapani, C. S., Williamson, D. M., David, T., & Bridgeman, B. (2012). Evaluation of the e-rater® scoring engine for the GRE® Issue and Argument Prompts. ETS Research Report Series, (1), i-106.
|
[34] |
Robson, S. M., Jones, A., & Abraham, J. (2007). Personality, faking, and convergent validity: A warning concerning warning statements. Human Performance, 21(1), 89-106.
|
[35] |
Rogers, W. T., & Harley, D. (1999). An empirical comparison of three-and four-choice items and tests: Susceptibility to testwiseness and internal consistency reliability. Educational and Psychological Measurement, 59(2), 234-247.
|
[36] |
Rudner, L. M., & Liang, T. (2002). Automated essay scoring using Bayes’ theorem. The Journal of Technology, Learning and Assessment, 1(2), 1-22.
|
[37] |
Slaughter, J. E., Christian, M. S., Podsakoff, N. P., Sinar, E. F., & Lievens, F. (2014). On the limitations of using situational judgment tests to measure interpersonal skills: The moderating influence of employee anger. Personnel Psychology, 67(4), 847-885.
|
[38] |
Süzen, N., Gorban, A. N., Levesley, J., & Mirkes, E. M. (2020). Automatic short answer grading and feedback using text mining methods. Procedia Computer Science, 169, 726-743.
|
[39] |
Tavoosi, S. (2022). Development and validation of a counterproductive work behavior situational judgment test with an open-ended response format: A computerized scoring approach (Unpublished master’s thesis). University of Central Florida.
|
[40] |
Wang, Y., & Peng, H. L. (2019). Validation on automatic scoring for open-ended questions in Chinese oral tests. China Examinations, 9, 63-71.
|
|
[王妍, 彭恒利. (2019). 汉语口语开放性试题计算机自动评分的效度验证. 中国考试, 9, 63-71.]
|
[41] |
Weekley, J. A., & Ployhart, R. E. (2005). Situational judgment: Antecedents and relationships with performance. Human Performance, 18(1), 81-104.
|
[42] |
Whetzel, D. L., & McDaniel, M. A. (2009). Situational judgment tests: An overview of current research. Human Resource Management Review, 19(3), 188-202.
|
[43] |
Williamson, D. M., Xi, X., & Breyer, F. J. (2012). A framework for evaluation and use of automated scoring. Educational Measurement: Issues and Practice, 31(1), 2-13.
|
[44] |
Xie, X. Q. (2013). Validation: From reasonable to plausible interpretation of test score. China Examinations, 7, 3-8.
|
|
[谢小庆. (2013). 效度: 从分数的合理解释到可接受解释. 中国考试, 7, 3-8.]
|
[45] |
Xu, J. P. (2004). Research on teacher competency model and evaluation (Unpublished doctoral dissertation). Beijing Normal University.
|
|
[徐建平. (2004). 教师胜任力模型与测评研究 (博士学位论文). 北京师范大学.]
|
[46] |
Yang, L., Xin, T., Luo, F., Zhang, S., & Tian, X. (2022). Automated evaluation of the quality of ideas in compositions based on concept maps. Natural Language Engineering, 28(4), 449-486.
|
[47] |
Zhang, Y., Lin, C., & Chi, M. (2020). Going deeper: Automatic short-answer grading by combining student and question models. User Modeling and User-Adapted Interaction, 30(1), 51-80.
|
[48] |
Zhao, Y., Shen, Y., & Yao, J. (2019, August). Recurrent neural network for text classification with hierarchical multiscale dense connections. Proceedings of the 28th International Joint Conference on Artificial Intelligence, 5450-5456.
|