Comparisons of item difficulty and passing scores by test equating in a basic medical education curriculum
- Authors: Jung Eun HWANG; Na Jin KIM; Su Young KIM
- Publication Type: Original Article
- From: Korean Journal of Medical Education 2019;31(2):147-157
- Country: Republic of Korea
- Language: English
- Abstract:
PURPOSE: Test equating studies in medical education have been conducted only for high-stakes examinations or to compare two tests administered within a single course. Based on item response theory, we equated computer-based test (CBT) results from the basic medical education curriculum at the College of Medicine, the Catholic University of Korea, and evaluated the validity of using fixed passing scores.
METHODS: We collected 232 CBTs (28,636 items) from 40 courses administered over a 9-year study period. The final data used for test equating comprised 12 pairs of tests. After test equating, Wilcoxon rank-sum tests were used to identify changes in item difficulty between previous and subsequent tests. We then identified gaps between equated passing scores and the actual passing scores of subsequent tests using an observed-score equating method.
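The difficulty comparison described above can be sketched as follows. This is a minimal illustration only: the difficulty values below are hypothetical, not the study's data, and `scipy.stats.ranksums` stands in for whatever software the authors actually used.

```python
from scipy.stats import ranksums

# Hypothetical IRT difficulty (b) parameters for two administrations
# of the same course test; values are illustrative only.
b_previous = [-1.2, -0.5, 0.0, 0.3, 0.8, 1.1, -0.9, 0.4]
b_subsequent = [-0.4, 0.2, 0.6, 1.0, 1.5, 0.9, 0.1, 1.3]

# Two-sided Wilcoxon rank-sum test comparing the two
# difficulty distributions, as in the study's METHODS.
stat, p = ranksums(b_previous, b_subsequent)
print(f"statistic={stat:.3f}, p={p:.4f}")
```

A negative statistic here indicates that the subsequent form's items tend to be more difficult (higher b values) than the previous form's, which is the pattern the study reports for 5 of its 12 pairs.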
RESULTS: The Wilcoxon rank-sum tests indicated no significant differences in item difficulty distribution by year for seven pairs. In the other five pairs, however, items were significantly more difficult in subsequent years than in previous years. Regarding the gaps between equated and actual passing scores, the equated passing scores were lower than the actual passing scores in 10 pairs and higher in the other two pairs.
CONCLUSION: Our results suggest that the item difficulty distributions of tests administered in the same course during successive terms can differ significantly. Using fixed passing scores without accounting for these differences may therefore be problematic.