ISSN 0439-755X
CN 11-1911/B
Sponsored by: Chinese Psychological Society
   Institute of Psychology, Chinese Academy of Sciences
Published by: Science Press

Acta Psychologica Sinica ›› 2013, Vol. 45 ›› Issue (6): 704-714. doi: 10.3724/SP.J.1041.2013.00704

• Article •

Modeling Self-reported Instrument Data with Gene Expression Programming

QIAN Jinxin; YU Jiayuan

  1. (School of Psychology, Nanjing Normal University, Nanjing 210097, China)
  • Received: 2012-11-19 Online: 2013-06-25 Published: 2013-06-25
  • Contact: YU Jiayuan
  • Supported by:

    National Social Science Foundation of China, education project (BBA080050); National Natural Science Foundation of China (71071065, 71131004); Jiangsu Province First-Class Key Discipline "Psychology".

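Abstract (Chinese): This study explores how gene expression programming can be used to model data measured with self-report scales. Scores of 400 middle school students were obtained with the Williams creativity test and the need for cognition scale; after data cleaning, the scores of 383 participants were retained as the modeling data set. Harman's single-factor test revealed no common method bias. A uniform design was used to optimize the configuration of five gene expression programming parameters, and the model with the smallest test error was found under the experimental condition with the greatest test fitness. The predictive accuracy of the gene expression programming model was compared with that of models built with BP neural networks, support vector regression, multiple linear regression, and quadratic polynomial regression. The results show that gene expression programming can be used to model self-report scale data, that the resulting model has higher predictive accuracy than models built with traditional methods, and that the model is robust.

Key words (Chinese): self-report scale, gene expression programming, modeling, creativity, uniform design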

Abstract: Traditional analytical models such as regression often fail to capture the complex relations among psychological variables. Neural networks and support vector regression can be used instead, but the models they produce are implicit rather than explicit expressions. Gene expression programming (GEP) can instead produce explicit models expressed in terms of the observed variables. To date, most data modeled with GEP have been obtained by objective methods, whereas much psychological measurement data come from self-report instruments and are affected by many subjective factors. Can such data be modeled with GEP? How large is the modeling error? Does GEP have any advantage over multivariate linear regression or polynomial regression? Is it more accurate than neural networks and support vector regression? This paper explores these questions. Responses from 400 middle school students were collected with the Williams creativity assessment packet and the need for cognition scale. Seventeen students were excluded because of abnormal responses, and the data from the remaining 383 students were retained for modeling. Harman's single-factor test revealed no common method bias. Five GEP parameters were optimized with a uniform design: head length, number of genes, fitness function, number of chromosomes, and mutation probability. Each parameter was set at nine levels, and the levels were combined into different test conditions. The condition with the maximum fitness was identified experimentally, and the GEP program was run 10 times under that condition. The accuracy of each resulting model was calculated, the model with the minimum error was identified, and its expression tree was drawn.
Models of the relation between need for cognition and creative personality traits were also built with BP neural networks, support vector regression, multivariate linear regression, and polynomial regression, and were compared with the GEP model. The results showed that (a) model 10, with four independent variables, was the most accurate of the ten GEP models; (b) the ten models had different expressions but very similar predictive errors, which supports the robustness of the GEP modeling method; and (c) the predictive errors were 1.28 for GEP, 2.76 for BP neural networks, 2.31 for support vector regression, 3.21 for polynomial regression, and 3.86 for multivariate linear regression. It can be concluded that (a) data from self-report instruments can be modeled with GEP even though they are affected by many subjective factors; (b) GEP modeling is more accurate than the other intelligent computing methods (neural networks and support vector regression) and the traditional statistical methods (multivariate linear regression and polynomial regression) examined; and (c) models built with GEP are robust: their predictive accuracy is similar even though their mathematical formulae differ considerably.
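The uniform-design step described above can be illustrated with a minimal sketch. It builds a nine-level U-type table with the good-lattice-point construction, which is one common way to generate uniform designs. The abstract does not say which table the authors used, so the construction, the choice of nine runs, the choice of generators, and the parameter labels below are all assumptions made for illustration.

```python
from math import gcd

# Labels for the five GEP parameters named in the abstract (the labels are ours).
PARAMETERS = ["head_length", "gene_number", "fitness_function",
              "chromosome_number", "mutation_probability"]

def glp_uniform_design(n_runs, n_factors):
    """Good-lattice-point U-type design.

    Entry (i, j) is (i * h_j) mod n_runs, with 0 mapped to n_runs, where each
    generator h_j is coprime to n_runs. Every column is then a permutation of
    the levels 1..n_runs.
    """
    generators = [h for h in range(1, n_runs) if gcd(h, n_runs) == 1]
    if n_factors > len(generators):
        raise ValueError("not enough generators coprime to n_runs")
    # Assumption: take the first generators; published tables instead pick
    # the subset with the lowest discrepancy.
    chosen = generators[:n_factors]
    return [[(i * h) % n_runs or n_runs for h in chosen]
            for i in range(1, n_runs + 1)]

design = glp_uniform_design(9, len(PARAMETERS))
for run, levels in enumerate(design, start=1):
    print(run, dict(zip(PARAMETERS, levels)))
```

Each printed row is one test condition, and each number is a level index (1 to 9) that would still have to be mapped to a concrete parameter value.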
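The model comparison reported above amounts to scoring each model's predictions on the same data and ranking the scores. The abstract does not state which error measure was used, so the sketch below assumes root mean squared error (RMSE); the function names and the toy numbers are hypothetical and are not the study's data.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted scores."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def rank_models(y_true, predictions):
    """Return (model name, RMSE) pairs sorted from lowest to highest error."""
    scores = {name: rmse(y_true, preds) for name, preds in predictions.items()}
    return sorted(scores.items(), key=lambda item: item[1])

# Toy observed scores and predictions from two hypothetical models.
observed = [10.0, 12.0, 14.0, 16.0]
predictions = {
    "model_b": [12.0, 10.0, 16.0, 14.0],  # every prediction is off by 2
    "model_a": [11.0, 11.0, 15.0, 15.0],  # every prediction is off by 1
}
ranking = rank_models(observed, predictions)
print(ranking)
```

Scoring all models on the same held-out cases is what makes errors such as those listed in the abstract directly comparable.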

Key words: self-reported instrument, gene expression programming, modeling, creativity, uniform design
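The abstract mentions drawing the expression tree of the best model. As a minimal sketch of how a GEP gene becomes such a tree, the code below decodes a K-expression (the linear gene notation used in GEP) breadth-first and evaluates it. The function set, the variable names, and the example gene are illustrative assumptions; they are not the functions or variables used in the study.

```python
import math
import operator
from collections import deque

# Assumed function set: symbol -> (arity, implementation).
FUNCTIONS = {
    "+": (2, operator.add),
    "-": (2, operator.sub),
    "*": (2, operator.mul),
    "Q": (1, lambda x: math.sqrt(abs(x))),  # protected square root
}

def decode(gene):
    """Build an expression tree from a valid K-expression.

    Symbols are consumed left to right and attached level by level;
    symbols left over in the tail are simply ignored.
    Each node is a pair [symbol, children].
    """
    nodes = [[symbol, []] for symbol in gene]
    queue = deque([nodes[0]])
    next_index = 1
    while queue:
        symbol, children = queue.popleft()
        arity = FUNCTIONS[symbol][0] if symbol in FUNCTIONS else 0
        for _ in range(arity):
            child = nodes[next_index]
            next_index += 1
            children.append(child)
            queue.append(child)
    return nodes[0]

def evaluate(node, env):
    """Evaluate a decoded tree given values for its terminal symbols."""
    symbol, children = node
    if symbol in FUNCTIONS:
        args = [evaluate(child, env) for child in children]
        return FUNCTIONS[symbol][1](*args)
    return env[symbol]

# "Q*+-abcd" decodes to sqrt((a + b) * (c - d)).
tree = decode("Q*+-abcd")
value = evaluate(tree, {"a": 1.0, "b": 3.0, "c": 6.0, "d": 2.0})
print(value)
```

Because the decoded tree is an explicit formula over named variables, a model found this way can be read and inspected directly, which is the contrast the abstract draws with neural networks and support vector regression.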