Statistical Analysis with the Item-Response Theory of the First Trial of the Computer-Based Nationwide Common Achievement Test in Medicine
- VernacularTitle:項目反応理論を用いた第1回共用試験医学系CBTの統計解析
- Author:Yoshio NITTA; Shinichi MAEKAWA; Takemi YANAGIMOTO; Tadahiko MAEDA; Motofumi YOSHIDA; Nobuo NARA; Tatsuki ISHIDA; Osamu FUKUSHIMA; Nobuhiko SAITO; Yasuichiro FUKUDA; Fumimaro TAKAKU; Takeshi ASO
- Publication Type:Journal Article
- Keywords:common achievement test; computer-based testing; item-response theory; classical test theory
- From:Medical Education 2005;36(1):3-9
- Country:Japan
- Language:Japanese
- Abstract:
Data from the first trial of the computer-based nationwide common achievement test in medicine, carried out from February through July 2002, were analyzed to evaluate the applicability of item-response theory. The trial test was designed to cover 6 areas of the core curriculum and included a total of 2791 items. For each area, 3 to 40 items were chosen randomly and administered to 5693 students in the fourth to sixth years; the responses of 5676 of these students were analyzed with specifically designed computer systems. Each student was presented with 100 items. The item-response patterns were analyzed with a 3-parameter logistic model (item discrimination, item difficulty, and guessing parameter). The main findings were: 1) Item difficulty and the percentage of correct answers were strongly negatively correlated (r = -0.969 to -0.982). 2) Item discrimination and the point-biserial correlation were moderately correlated (r = 0.304 to 0.511). 3) The estimated abilities and the percentage of correct answers were strongly correlated (r = 0.810 to 0.945). 4) The mean ability increased with school year. 5) The correlation coefficients among the ability scores for the 6 curriculum areas were less than 0.6. Because the nationwide common achievement test was designed to present items randomly to each student, item-response theory can be used to adjust for differences among the test sets. The first trial test was designed without considering item-response theory, but the second trial test was administered with a design better suited for comparison. Results of an analysis of the second trial will be reported soon.
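For readers unfamiliar with the 3-parameter logistic model named in the abstract, the following is a minimal sketch of its standard textbook form, not the authors' software. The function name `p_correct_3pl` and the symbols `theta` (ability), `a` (discrimination), `b` (difficulty), and `c` (guessing) are assumptions chosen for illustration. Some formulations also multiply the exponent by a scaling constant of about 1.7, which is omitted here.

```python
import math


def p_correct_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct answer under a 3-parameter logistic model.

    theta: examinee ability (hypothetical symbol, not from the abstract)
    a: item discrimination
    b: item difficulty
    c: guessing parameter (lower asymptote of the curve)
    """
    # Logistic curve in ability, rescaled so it rises from c toward 1.
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # When ability equals difficulty, the probability is halfway
    # between the guessing floor c and 1.
    print(p_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.25))  # 0.625

    # With ability held fixed, a more difficult item is answered
    # correctly less often.
    for b in (-1.0, 0.0, 1.0):
        print(b, p_correct_3pl(theta=0.0, a=1.0, b=b, c=0.25))
```

In this form, raising `b` while holding ability fixed lowers the probability of a correct answer, which is consistent with the negative correlation between item difficulty and the percentage of correct answers reported in finding 1.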