Linear programming method to construct equated item sets for the implementation of periodical computer-based testing for the Korean Medical Licensing Examination
- Author:Dong Gi SEO; Myeong Gi KIM; Na Hui KIM; Hye Sook SHIN; Hyun Jung KIM
- Publication Type:Original Article
- From:Journal of Educational Evaluation for Health Professions 2018;15():26-
- Country:Republic of Korea
- Language:English
- Abstract:
PURPOSE:This study aimed to identify the best way of developing equivalent item sets and to propose a stable and effective management plan for periodical licensing examinations.
METHODS:Five pre-equated item sets were developed using linear programming, based on the predicted correct answer rate of each item. These item sets were compared with item sets developed by random item selection, using the actual correct answer rate (ACAR) and the difficulty parameter from item response theory (IRT) as criteria. Results with and without common items were compared in the same way. ACAR and IRT difficulty were used to determine whether there was a significant difference between the pre-equating conditions.
RESULTS:There was a statistically significant difference in IRT difficulty among the pre-equating conditions. When the predicted correct answer rate was divided into 2 or 3 difficulty categories, the 5 item sets were constructed with equivalent ACAR and IRT difficulty parameters. In the comparison of item set conditions with and without common items, including common items did not contribute significantly to the equating of the 5 item sets.
CONCLUSION:This study suggests that the linear programming method can be used to construct equated item sets that reflect each content area. The best method suggested for constructing equated item sets is to divide the predicted correct answer rate into 2 or 3 difficulty categories, regardless of common items. If pre-equated item sets are required to construct a test based on actual data, several methods should be compared in simulation studies to determine which is optimal before a real test is administered.
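
The idea of dividing the predicted correct answer rate into difficulty categories and spreading items evenly over 5 sets within each content area can be illustrated with a minimal sketch. This is not the study's linear program: it is a greedy stratified assignment, swapped in so the example runs with the Python standard library alone. The cut points (0.5 and 0.8), the content area names, and the simulated item pool are all assumptions made for illustration.

```python
import random
from collections import defaultdict
from statistics import mean


def difficulty_category(rate, cuts):
    """Return 0 for the hardest band, up to len(cuts) for the easiest."""
    return sum(rate >= c for c in cuts)


def build_item_sets(items, n_sets=5, cuts=(0.5, 0.8)):
    """Split items into n_sets sets balanced by content area and difficulty band.

    items: iterable of (item_id, content_area, predicted_rate) tuples.
    cuts:  assumed cut points; two cuts give 3 difficulty categories.
    """
    # Group items by (content area, difficulty category).
    strata = defaultdict(list)
    for item in items:
        _, area, rate = item
        strata[(area, difficulty_category(rate, cuts))].append(item)

    sets = [[] for _ in range(n_sets)]
    totals = [0.0] * n_sets  # running sum of predicted rates per set
    for key in sorted(strata):
        counts = [0] * n_sets  # items from this stratum already in each set
        # Easiest items first; each goes to the set holding the fewest items
        # from this stratum, ties broken by the lowest running total.
        for item in sorted(strata[key], key=lambda it: it[2], reverse=True):
            target = min(range(n_sets), key=lambda s: (counts[s], totals[s]))
            sets[target].append(item)
            counts[target] += 1
            totals[target] += item[2]
    return sets


# Hypothetical pool: 3 content areas with 40 items each and random predicted rates.
random.seed(0)
pool = [(f"{area}-{i}", area, round(random.uniform(0.2, 0.95), 2))
        for area in ("internal", "surgery", "pediatrics") for i in range(40)]
for number, item_set in enumerate(build_item_sets(pool), start=1):
    print(number, len(item_set), round(mean(it[2] for it in item_set), 3))
```

Each printed row gives the set number, the number of items, and the mean predicted correct answer rate. Within every content area and difficulty category, the item counts of the 5 sets differ by at most one. A linear programming formulation would instead state these balance requirements as explicit constraints and leave the assignment to a solver.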