1.Archetype Model-Driven Development Framework for EHR Web System.
Shinji KOBAYASHI ; Eizen KIMURA ; Ken ISHIHARA
Healthcare Informatics Research 2013;19(4):271-277
OBJECTIVES: This article describes the Web application framework for Electronic Health Records (EHRs) that we have developed to reduce construction costs for EHR systems. METHODS: The openEHR project has developed a clinical model-driven architecture for future-proof, interoperable EHR systems. This project provides the specifications to standardize clinical domain model implementations, upon which the ISO/CEN 13606 standards are based. The reference implementation has been formally described in Eiffel. Moreover, C# and Java implementations have been developed as references. Although scripting languages have become more popular in recent years because of their higher efficiency and faster development, they had not been used in openEHR implementations. Since 2007, we have used the Ruby language and Ruby on Rails (RoR) as an agile development platform to implement EHR systems that conform to the openEHR specifications. RESULTS: We implemented almost all of the specifications, the Archetype Definition Language parser, and an RoR scaffold generator from archetypes. Although some problems have emerged, most of them have been resolved. CONCLUSIONS: We have provided an agile EHR Web framework that can build Web systems from archetype models using RoR. The feasibility of the archetype model to provide semantic interoperability of EHRs has been demonstrated, and we have verified that it is suitable for the construction of EHR systems.
Automatic Data Processing; Computing Methodologies; Electronic Health Records; Indonesia; Internet; Semantics
2.Mapping Drug Terms via Integration of a Retrieval-Augmented Generation Algorithm with a Large Language Model
Eizen KIMURA ; Yukinobu KAWAKAMI ; Shingo INOUE ; Ai OKAJIMA
Healthcare Informatics Research 2024;30(4):355-363
Objectives:
This study evaluated the efficacy of integrating a retrieval-augmented generation (RAG) model and a large language model (LLM) to improve the accuracy of drug name mapping across international vocabularies.
Methods:
Drug ingredient names were translated into English using the Japanese Accepted Names for Pharmaceuticals. Drug concepts were extracted from the standard vocabulary of OHDSI, and the accuracy of mappings between translated terms and RxNorm was assessed by vector similarity, using the BioBERT-generated embedded vectors as the baseline. Subsequently, we developed LLMs with RAG that distinguished the final candidates from the baseline. We assessed the efficacy of the LLM with RAG in candidate selection by comparing it with conventional methods based on vector similarity.
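The vector-similarity baseline described above can be illustrated with a minimal sketch. It assumes that each term has already been converted to an embedding vector; the three-dimensional vectors, the concept names, and the function names below are hypothetical stand-ins, not BioBERT output or the authors' code.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, concept_vecs, top_k=3):
    """Return the top_k (concept name, similarity) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in concept_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
toy_concepts = {
    "acetaminophen": [0.9, 0.1, 0.0],
    "ibuprofen": [0.1, 0.9, 0.0],
    "amoxicillin": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.1]  # stand-in for the embedding of a translated term

ranked = rank_candidates(query, toy_concepts, top_k=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

In a pipeline like the one described, the ranked list would serve as the candidate set that a later selection step narrows down.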
Results:
The evaluation metrics demonstrated the superior performance of the combined LLM + RAG over traditional vector similarity methods. Notably, the hit rates of the Mixtral 8x7b and GPT-3.5 models exceeded 90%, significantly outperforming the baseline rate of 64% across stratified groups of PO drugs, injections, and all interventions. Furthermore, the r-precision metric, which measures the alignment between model judgment and human evaluation, revealed a notable improvement in LLM performance, ranging from 41% to 50% compared to the baseline of 23%.
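For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It assumes the standard information-retrieval definitions (hit rate: the share of queries with at least one correct concept among the top k candidates; R-precision: the precision within the top R results, where R is the number of concepts judged correct by human reviewers). The abstract does not give the exact formulas used in the study, and the query data here is hypothetical.

```python
def hit_rate(ranked_lists, relevant_sets, k):
    """Fraction of queries with at least one relevant item in the top k."""
    if not ranked_lists:
        raise ValueError("at least one query is required")
    hits = sum(
        1
        for ranked, relevant in zip(ranked_lists, relevant_sets)
        if any(candidate in relevant for candidate in ranked[:k])
    )
    return hits / len(ranked_lists)


def r_precision(ranked, relevant):
    """Precision within the top R results, where R = number of relevant items."""
    r = len(relevant)
    if r == 0:
        return 0.0  # no relevant items: defined as 0 in this sketch
    return sum(1 for candidate in ranked[:r] if candidate in relevant) / r


# Hypothetical results for four queries (letters stand in for concept IDs).
ranked_lists = [
    ["a", "b", "c"],
    ["d", "e", "f"],
    ["g", "h", "i"],
    ["j", "k", "l"],
]
relevant_sets = [{"a", "c"}, {"e"}, {"x"}, {"j", "k"}]

hit_at_1 = hit_rate(ranked_lists, relevant_sets, k=1)
hit_at_3 = hit_rate(ranked_lists, relevant_sets, k=3)
mean_r_precision = sum(
    r_precision(ranked, relevant)
    for ranked, relevant in zip(ranked_lists, relevant_sets)
) / len(ranked_lists)

print(hit_at_1, hit_at_3, mean_r_precision)
```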
Conclusions:
Integrating an RAG and an LLM outperformed conventional string comparison and embedding vector similarity techniques, offering a more refined approach to global drug information mapping.
6.Education and household income and carotid intima-media thickness in Japan: baseline data from the Aidai Cohort Study in Yawatahama, Uchiko, Seiyo, and Ainan.
Yoshihiro MIYAKE ; Keiko TANAKA ; Hidenori SENBA ; Yasuko HASEBE ; Toyohisa MIYATA ; Takashi HIGAKI ; Eizen KIMURA ; Bunzo MATSUURA ; Ryuichi KAWAMOTO
Environmental Health and Preventive Medicine 2021;26(1):88-88
BACKGROUND:
Epidemiological evidence for the relationship between education and income and carotid intima-media thickness (CIMT) has been limited and inconsistent. The present cross-sectional study investigated this issue using baseline data from the Aidai Cohort Study.
METHODS:
Study subjects were 2,012 Japanese men and women aged 34-88 years. Right and left CIMT were measured at the common carotid artery using an automated carotid ultrasonography device. Maximum CIMT was defined as the largest CIMT value in either the left or right common carotid artery. Carotid wall thickening was defined as a maximum CIMT value > 1.0 mm.
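The two definitions above (maximum CIMT and carotid wall thickening) translate directly into code. The following is a minimal sketch only; the class name, field names, and measurement values are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass

THICKENING_THRESHOLD_MM = 1.0  # threshold stated in the abstract


@dataclass
class CimtMeasurement:
    """Left and right common carotid artery CIMT for one participant, in mm."""
    left_mm: float
    right_mm: float

    @property
    def max_cimt(self) -> float:
        # Maximum CIMT: the larger of the left and right values.
        return max(self.left_mm, self.right_mm)

    @property
    def has_wall_thickening(self) -> bool:
        # Strictly greater than 1.0 mm, so exactly 1.0 mm does not count.
        return self.max_cimt > THICKENING_THRESHOLD_MM


def prevalence(measurements):
    """Share of participants with carotid wall thickening."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    return sum(m.has_wall_thickening for m in measurements) / len(measurements)


# Hypothetical participants, for illustration only.
sample = [
    CimtMeasurement(0.7, 0.8),
    CimtMeasurement(1.1, 0.9),
    CimtMeasurement(1.0, 1.0),
    CimtMeasurement(0.6, 1.2),
]
sample_prevalence = prevalence(sample)
print(sample_prevalence)
```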
RESULTS:
The prevalence of carotid wall thickening was 13.0%. In participants under 60 years of age (n = 703) and in those aged 60 to 69 years (n = 837), neither education nor household income was associated with carotid wall thickening or with maximum CIMT. Among those aged 70 years or older (n = 472), however, higher educational level, but not household income, was independently related to a lower prevalence of carotid wall thickening: the multivariate-adjusted odds ratio for high vs. low educational level was 0.43 (95% confidence interval 0.21-0.83, p for trend = 0.01). A significant inverse association was observed between education, but not household income, and maximum CIMT (p for trend = 0.006).
CONCLUSIONS:
Higher educational level may be associated with a lower prevalence of carotid wall thickening and a decrease in maximum CIMT only in participants aged 70 years or older.
Adult; Aged; Aged, 80 and over; Carotid Intima-Media Thickness; Cohort Studies; Cross-Sectional Studies; Educational Status; Female; Humans; Income; Japan/epidemiology*; Male; Middle Aged; Odds Ratio; Prevalence