1. Resources for assigning MeSH IDs to Japanese medical terms
Genomics & Informatics 2019;17(2):e16-
Medical Subject Headings (MeSH), a medical thesaurus created by the National Library of Medicine (NLM), is a useful resource for natural language processing (NLP). In this article, the current status of the Japanese version of MeSH is reviewed. An online investigation found that Japanese-English dictionaries that assign MeSH information to applicable terms exist, but they are difficult to access for NLP use because of license restrictions. Here, we investigate an open-source Japanese-English glossary as an alternative means of assigning MeSH IDs to Japanese terms, to obtain preliminary data for an NLP proof of concept.
Asian Continental Ancestry Group ; Humans ; Licensure ; Medical Subject Headings ; Methods ; National Library of Medicine (U.S.) ; Natural Language Processing ; Vocabulary, Controlled
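The glossary-based assignment described in this abstract can be pictured as a join between a Japanese-English glossary and an English term-to-UID index. The following is a minimal sketch under that assumption only; the function name, the exact-match lookup, and the placeholder UIDs are hypothetical and are not taken from the article.

```python
def assign_mesh_ids(glossary, mesh_index):
    """Map Japanese terms to MeSH UIDs via their English glossary entries.

    glossary:   dict of Japanese term -> English term (hypothetical input)
    mesh_index: dict of lowercase English MeSH term -> UID (hypothetical input)
    """
    assigned = {}
    for ja_term, en_term in glossary.items():
        # Normalize the English side before the exact-match lookup.
        uid = mesh_index.get(en_term.strip().lower())
        if uid is not None:
            assigned[ja_term] = uid
    return assigned


# Toy data; the UIDs are placeholders, not real MeSH identifiers.
glossary = {"発熱": "Fever", "咳": "Cough ", "頭痛薬": "headache remedy"}
mesh_index = {"fever": "UID_FEVER", "cough": "UID_COUGH"}

assigned = assign_mesh_ids(glossary, mesh_index)
```

Terms whose English gloss has no exact match in the index (here "頭痛薬") are simply left unassigned; a real pipeline would likely need synonym handling beyond this sketch.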
2. O-JMeSH: creating a bilingual English-Japanese controlled vocabulary of MeSH UIDs through machine translation and mutual information
Felipe SOARES ; Yuka TATEISI ; Terue TAKATSUKI ; Atsuko YAMAGUCHI
Genomics & Informatics 2021;19(3):e26-
Previous approaches to creating a controlled vocabulary for Japanese have relied on existing bilingual dictionaries and transformation rules to produce such mappings. However, given the new terms that may have been introduced by coronavirus disease 2019 (COVID-19) and the emphasis on respiratory and infection-related terms, coverage is not guaranteed. In this work, we propose creating a Japanese bilingual controlled vocabulary based on MeSH terms assigned to COVID-19-related publications. To this end, we combined manual curation of several bilingual dictionaries with a computational approach based on machine translation of sentences containing such terms and ranking of candidate translations for the individual terms by mutual information. Our results show that we achieved nearly 99% occurrence coverage in LitCovid, while our computational approach achieved an average accuracy of 63.33% for all terms and 84.51% for drugs and chemicals.
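The ranking step can be illustrated with a minimal sketch, assuming pointwise mutual information (PMI) computed from sentence-level co-occurrence between an English term and the tokens of its machine-translated sentences. The abstract does not specify the exact mutual information formulation or the tokenizer, so the function name, the pre-tokenized input, and the toy sentences below are all hypothetical.

```python
import math
from collections import Counter


def rank_candidates_by_pmi(pairs, term):
    """Rank Japanese tokens by PMI with an English term.

    pairs: list of (english_sentence, japanese_tokens) tuples, where the
           Japanese side is assumed to be an already tokenized translation.
    """
    n = len(pairs)
    token_counts = Counter()  # number of sentences containing each token
    joint_counts = Counter()  # number of sentences containing term and token
    term_count = 0
    for english, ja_tokens in pairs:
        has_term = term.lower() in english.lower()
        term_count += has_term
        for tok in set(ja_tokens):
            token_counts[tok] += 1
            if has_term:
                joint_counts[tok] += 1
    if term_count == 0:
        return []
    scores = {}
    for tok, joint in joint_counts.items():
        # PMI = log( p(term, tok) / (p(term) * p(tok)) )
        p_joint = joint / n
        p_term = term_count / n
        p_tok = token_counts[tok] / n
        scores[tok] = math.log(p_joint / (p_term * p_tok))
    # Highest PMI first; ties broken by token for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy sentence pairs (illustrative only).
pairs = [
    ("Fever is a common symptom.", ["発熱", "は", "一般的", "な", "症状", "です"]),
    ("Patients with fever were tested.", ["発熱", "患者", "は", "検査", "され", "た"]),
    ("Cough is a common symptom.", ["咳", "は", "一般的", "な", "症状", "です"]),
    ("Patients were tested.", ["患者", "は", "検査", "され", "た"]),
]

ranked = rank_candidates_by_pmi(pairs, "fever")
best_candidate = ranked[0][0]
```

In this toy corpus "発熱" occurs only in sentences that contain "fever", so it receives the highest score, while function words that appear everywhere score zero. Plain PMI is known to favor rare tokens, so a practical system would probably add frequency thresholds or smoothing.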
3. Constructing Japanese MeSH term dictionaries related to the COVID-19 literature
Atsuko YAMAGUCHI ; Terue TAKATSUKI ; Yuka TATEISI ; Felipe SOARES
Genomics & Informatics 2021;19(3):e25-
The coronavirus disease 2019 (COVID-19) pandemic has led to a flood of research papers, and the information is updated with considerable frequency. For society to benefit from this research, it is necessary to promote the sharing of up-to-date knowledge from these papers. However, because most research papers are written in English, it is difficult for people who are not familiar with English medical terms to obtain knowledge from them. To facilitate the sharing of knowledge from English-language COVID-19 papers with Japanese speakers, we tried to construct a dictionary with an open license by assigning Japanese terms to the MeSH unique identifiers (UIDs) annotated to words in the texts of COVID-19 papers. Using this dictionary, 98.99% of all occurrences of MeSH terms in COVID-19 papers were covered. We also created a curated version of the dictionary and uploaded it to PubDictionary for wider use in the PubAnnotation system.
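The occurrence coverage reported here can be read as the fraction of MeSH annotations whose UID has a Japanese entry in the dictionary, counting every occurrence rather than every distinct UID. A minimal sketch under that reading follows; the function name, input shapes, and placeholder UIDs are hypothetical and do not reflect the authors' actual pipeline.

```python
from collections import Counter


def occurrence_coverage(annotated_uids, dictionary):
    """Return the share of annotation occurrences whose UID is in the dictionary.

    annotated_uids: iterable of UIDs, one per annotated occurrence
    dictionary:     dict of UID -> Japanese term
    """
    counts = Counter(annotated_uids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(c for uid, c in counts.items() if uid in dictionary)
    return covered / total


# Toy data; the UIDs are placeholders, not real MeSH identifiers.
dictionary = {"UID_FEVER": "発熱", "UID_COUGH": "咳"}
occurrences = ["UID_FEVER"] * 3 + ["UID_COUGH"] * 2 + ["UID_UNMAPPED"]

coverage = occurrence_coverage(occurrences, dictionary)
```

Because frequent terms dominate the count, occurrence coverage can be high even when many rare UIDs lack a Japanese entry, which is why it is worth distinguishing from coverage over distinct UIDs.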