ISSN 0439-755X
CN 11-1911/B

2001, Vol. 33, Issue (02): 97-103.


PUTONGHUA TEST: FEASIBILITY OF TAPE-RECORDED MARKING, RELIABILITY AND ECONOMIC EFFICIENCY

Chang Lei, Hau XitTai, Ho Wai-Xit, Wen Jian-Bing (The Chinese University of Hong Kong, Hong Kong); Wang Yuguang (Yunnan Normal University, Kunming 650092)

  • Published:2001-04-25 Online:2001-04-25

Abstract: This article reports two generalizability studies of the State Putonghua test. In the first study, we examined the consistency between live and tape-recorded assessment of Putonghua. Twenty-five examinees participated in the first study. There were eight raters, divided into four panels of two each. Five examinees were assigned to one of the four panels. The live assessments in the four panels took place simultaneously. During the live assessment, examinees were tape-recorded. Each examinee's tape was later assessed by all eight raters. The standard assessment instrument (prepared by the State Language Commission) was used. For the purpose of this study, all examinees received the same items. The items were rated on a three-point scale, where 0 = no credit, 1 = partial credit, and 2 = full credit. The objects of measurement were examinees (e), who were nested in panels (p). The raters (r), who were also nested in panels, were crossed with examinees. Items (i) and mode (m) of assessment (i.e., recorded versus live assessment) were crossed with the rest of the conditions. The G study design was (e × r):p × i × m. Except for the mode facet, which was considered fixed, all other facets and the objects of measurement were assumed random. This special G study focused on determining the consistency between the live and recorded assessment. The results indicated a relatively high degree of consistency. The signal-to-noise ratio reached 0.80, meaning that 80% of the absolute domain status differences in Putonghua were exchangeable between these two modes of assessment.
The purpose of the second study was to determine an efficient tape-recorded assessment procedure to be adopted in the future. The objective was to find the number of raters and items for measuring Putonghua that would maximize reliability and minimize costs. The second study adopted a fully crossed design so that the unique variance of each component could be estimated. Tapes of 25 examinees were each rated by the same six raters on 50 single-word items. The G study design was e × r × i. Among the seven variance components, the largest was associated with the item facet, indicating the importance of sampling more items. Using two raters and 100 items would achieve satisfactory reliabilities of 0.90 and 0.84 for norm-referenced and domain-referenced use of the test, respectively.
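
The rater-and-item trade-off described for the second study can be illustrated with a short D-study calculation. The sketch below assumes the standard formulas for a fully crossed, all-random e × r × i design: the generalizability coefficient for norm-referenced use, and the dependability index (phi) for domain-referenced use. The variance component values are hypothetical placeholders chosen only for illustration, not the estimates reported in the article, and the names `VarianceComponents` and `d_study` are likewise illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VarianceComponents:
    """The seven variance components of a crossed e x r x i G study."""
    e: float    # examinees (object of measurement)
    r: float    # raters
    i: float    # items
    er: float   # examinee x rater interaction
    ei: float   # examinee x item interaction
    ri: float   # rater x item interaction
    eri: float  # three-way interaction confounded with residual error


def d_study(vc: VarianceComponents, n_r: int, n_i: int) -> tuple[float, float]:
    """Return (generalizability coefficient, phi) for n_r raters and n_i items."""
    # Relative error: only interactions that involve examinees.
    rel_err = vc.er / n_r + vc.ei / n_i + vc.eri / (n_r * n_i)
    # Absolute error: relative error plus rater, item, and rater x item effects.
    abs_err = rel_err + vc.r / n_r + vc.i / n_i + vc.ri / (n_r * n_i)
    g_coef = vc.e / (vc.e + rel_err)  # norm-referenced use
    phi = vc.e / (vc.e + abs_err)     # domain-referenced use
    return g_coef, phi


# HYPOTHETICAL components (not taken from the article); the item facet is
# made the largest only to mirror the qualitative pattern in the abstract.
example_vc = VarianceComponents(
    e=0.040, r=0.001, i=0.050, er=0.002, ei=0.030, ri=0.002, eri=0.080
)

if __name__ == "__main__":
    for n_r, n_i in [(1, 50), (2, 50), (2, 100), (6, 50)]:
        g_coef, phi = d_study(example_vc, n_r, n_i)
        print(f"raters={n_r} items={n_i} g={g_coef:.3f} phi={phi:.3f}")
```

Because the absolute error variance adds the rater, item, and rater-by-item components to the relative error variance, phi can never exceed the generalizability coefficient. When the item component is large, increasing the number of items narrows the gap between the two.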
