Introduction: This study used data from EPOC2 to analyze the reproducibility of the residency training evaluation forms revised in FY 2020. Methods: Reproducibility was assessed by calculating the concordance rate and the intraclass correlation coefficient (ICC) between evaluations given by two clinical educators during the same clinical department rotation. In addition, Bland-Altman plots were created to visualize agreement between the paired evaluations. Results: Of the 13,184 residents at facilities using EPOC2, approximately 3,800 who were evaluated more than twice by clinical educators during the same training period were included in the analysis. The average concordance rates for items A, B, and C were 68.6%, 43.8%, and 57.6%, respectively, indicating variability in evaluations among clinical educators. ICC values were also low. Discussion: Our findings suggest that the reproducibility of the evaluations was low and that discrepancies among clinical educators’ assessments were evident. To improve reproducibility, we recommend increasing the number of evaluations by different clinical educators and strengthening clinical educator workshops.
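
For illustration, the three agreement measures named in the Methods can be sketched as follows. This is a minimal sketch, not the authors' analysis code. The abstract does not state which ICC form was used, so the sketch assumes ICC(2,1) (two-way random effects, absolute agreement, single rater), and it assumes the conventional 1.96 SD limits of agreement for the Bland-Altman summary. The function names and the sample scores are hypothetical.

```python
from statistics import mean, stdev


def concordance_rate(a, b):
    """Share of residents given an identical score by both educators."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def icc_2_1(a, b):
    """ICC(2,1) for two raters (assumed form; the abstract does not specify one)."""
    n, k = len(a), 2
    rows = list(zip(a, b))
    grand = mean(list(a) + list(b))
    row_means = [mean(r) for r in rows]
    col_means = [mean(a), mean(b)]
    # Two-way ANOVA sums of squares: residents (rows), educators (columns), residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in rows for x in r)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def bland_altman(a, b):
    """Mean difference (bias) and 1.96 SD limits of agreement."""
    diffs = [x - y for x, y in zip(a, b)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical scores from two educators for the same six residents
educator_1 = [3, 3, 2, 4, 3, 2]
educator_2 = [3, 2, 2, 4, 4, 2]

print("concordance rate:", concordance_rate(educator_1, educator_2))
print("ICC(2,1):", icc_2_1(educator_1, educator_2))
print("bias, lower, upper:", bland_altman(educator_1, educator_2))
```

The concordance rate counts only exact matches, whereas the ICC also reflects how large the disagreements are relative to the spread between residents, so the two measures can diverge on the same data.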