1.Feasibility of fully automated classification of whole slide images based on deep learning
Kyung Ok CHO ; Sung Hak LEE ; Hyun Jong JANG
The Korean Journal of Physiology and Pharmacology 2020;24(1):89-99
Although microscopic analysis of tissue slides has been the basis for disease diagnosis for decades, intra- and inter-observer variabilities remain issues to be resolved. The recent introduction of digital scanners has allowed for using deep learning in the analysis of tissue images because many whole slide images (WSIs) are accessible to researchers. In the present study, we investigated the possibility of a deep learning-based, fully automated, computer-aided diagnosis system with WSIs from a stomach adenocarcinoma dataset. Three different convolutional neural network architectures were tested to determine the best architecture for the tissue classifier. Each network was trained to classify small tissue patches as normal or tumor. Based on the patch-level classification, tumor probability heatmaps can be overlaid on tissue images. We observed three different tissue patterns: clear normal, clear tumor, and ambiguous cases. We suggest that longer inspection time can be assigned to ambiguous cases than to clear normal cases, increasing the accuracy and efficiency of histopathologic diagnosis by pre-evaluating the status of the WSIs. When the classifier was tested with a completely different WSI dataset, the performance was not optimal because of the different tissue preparation quality. Including a small amount of data from the new dataset in training greatly improved performance on that dataset. These results indicate that a WSI dataset should include tissues prepared under many different conditions to construct a generalized tissue classifier. Thus, multi-national, multi-center datasets should be built for the application of deep learning in real-world medical practice.
Adenocarcinoma; Classification; Dataset; Diagnosis; Learning; Observer Variation; Stomach
3.Development and Validation of a Deep Learning System for Segmentation of Abdominal Muscle and Fat on Computed Tomography
Hyo Jung PARK ; Yongbin SHIN ; Jisuk PARK ; Hyosang KIM ; In Seob LEE ; Dong Woo SEO ; Jimi HUH ; Tae Young LEE ; TaeYong PARK ; Jeongjin LEE ; Kyung Won KIM
Korean Journal of Radiology 2020;21(1):88-100
dataset of 883 CT scans from 467 subjects. Axial CT images obtained at the inferior endplate level of the 3rd lumbar vertebra were used for the analysis. Manually drawn segmentation maps of the skeletal muscle, visceral fat, and subcutaneous fat were created to serve as ground truth data. The performance of the fully convolutional network-based segmentation system was evaluated using the Dice similarity coefficient and cross-sectional area error, for both a separate internal validation dataset (426 CT scans from 308 subjects) and an external validation dataset (171 CT scans from 171 subjects from two outside hospitals). RESULTS: The mean Dice similarity coefficients for muscle, subcutaneous fat, and visceral fat were high for both the internal (0.96, 0.97, and 0.97, respectively) and external (0.97, 0.97, and 0.97, respectively) validation datasets, while the mean cross-sectional area errors for muscle, subcutaneous fat, and visceral fat were low for both internal (2.1%, 3.8%, and 1.8%, respectively) and external (2.7%, 4.6%, and 2.3%, respectively) validation datasets. CONCLUSION: The fully convolutional network-based segmentation system exhibited high performance and accuracy in the automatic segmentation of abdominal muscle and fat on CT images.
Abdominal Muscles; Adipose Tissue; Artificial Intelligence; Dataset; Intra-Abdominal Fat; Learning; Muscle, Skeletal; Muscles; Sarcopenia; Spine; Subcutaneous Fat; Tomography, X-Ray Computed
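The abstract of entry 3 evaluates segmentation with the Dice similarity coefficient and a cross-sectional area error. The sketch below shows how these two quantities might be computed for one binary mask. It is an illustration only, not the authors' code: the function names and the toy masks are hypothetical, and the area error is assumed here to be the absolute difference in pixel count relative to the ground truth, which the abstract does not spell out.

```python
def _flatten(mask):
    """Turn a 2D list of 0/1 values into a flat list of booleans."""
    return [bool(v) for row in mask for v in row]


def dice_coefficient(pred, truth):
    """Dice = 2 * |A and B| / (|A| + |B|) for two binary masks of equal size."""
    p, t = _flatten(pred), _flatten(truth)
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    overlap = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


def area_error_percent(pred, truth):
    """ASSUMED definition: |area_pred - area_truth| / area_truth * 100."""
    pred_area, truth_area = sum(_flatten(pred)), sum(_flatten(truth))
    if truth_area == 0:
        raise ValueError("ground-truth mask is empty")
    return abs(pred_area - truth_area) / truth_area * 100.0


# Hypothetical 3x4 masks (1 = tissue of interest), not data from the study.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 1, 0],
        [0, 1, 1, 1],
        [0, 0, 1, 0]]

dice = dice_coefficient(pred, truth)
area_err = area_error_percent(pred, truth)
print(f"Dice = {dice:.3f}, area error = {area_err:.1f}%")
```

In this toy case both masks cover six pixels, so the assumed area error is zero even though the overlap is imperfect, which suggests why the two measures can be informative side by side.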
4.Involvement of the TNF-α Pathway in TKI Resistance and Suggestion of TNFR1 as a Predictive Biomarker for TKI Responsiveness in Clear Cell Renal Cell Carcinoma
Hee Sang HWANG ; Yun Yong PARK ; Su Jin SHIN ; Heounjeong GO ; Ja Min PARK ; Sun Young YOON ; Jae Lyun LEE ; Yong Mee CHO
Journal of Korean Medical Science 2020;35(5):31-
dataset from patient-derived xenograft model for TKI-treated ccRCC (GSE76068) was retrieved. Commonly altered pathways between the datasets were investigated by Ingenuity Pathway Analysis using commonly regulated differently expressed genes (DEGs). The significance of candidate DEG on intrinsic TKI resistance was assessed through immunohistochemistry in a separate cohort of 101 TKI-treated ccRCC cases. RESULTS: TNFRSF1A gene expression and tumor necrosis factor (TNF)-α pathway were upregulated in ccRCCs with acquired TKI resistance in both microarray datasets. Also, high expression (> 10% of labeled tumor cells) of TNF receptor 1 (TNFR1), the protein product of TNFRSF1A gene, was correlated with sarcomatoid dedifferentiation and was an independent predictive factor of clinically unfavorable response and shorter survivals in separated TKI-treated ccRCC cohort. CONCLUSION: TNF-α signaling may play a role in TKI resistance, and TNFR1 expression may serve as a predictive biomarker for clinically unfavorable TKI responses in ccRCC.
Biomarkers; Carcinoma, Renal Cell; Cohort Studies; Dataset; Drug Resistance; Gene Expression; Gene Expression Profiling; Heterografts; Humans; Immunohistochemistry; Protein-Tyrosine Kinases; Receptors, Tumor Necrosis Factor; Receptors, Tumor Necrosis Factor, Type I; Tumor Necrosis Factor-alpha
5.Pathologic discrepancies between colposcopy-directed biopsy and loop electrosurgical excision procedure of the uterine cervix in women with cytologic high-grade squamous intraepithelial lesions
Se Ik KIM ; Se Jeong KIM ; Dong Hoon SUH ; Kidong KIM ; Jae Hong NO ; Yong Beom KIM
Journal of Gynecologic Oncology 2020;31(2):13-
OBJECTIVE: To investigate pathologic discrepancies between colposcopy-directed biopsy (CDB) of the cervix and loop electrosurgical excision procedure (LEEP) in women with cytologic high-grade squamous intraepithelial lesions (HSILs). METHODS: We retrospectively identified 297 patients who underwent both CDB and LEEP for HSILs in cervical cytology between 2015 and 2018, and compared their pathologic results. Considering the LEEP to be the gold standard, we evaluated the diagnostic performance of CDB for identifying cervical intraepithelial neoplasia (CIN) grades 2 and 3, adenocarcinoma in situ, and cancer (HSIL+). We also performed age subgroup analyses. RESULTS: Among the study population, 90.9% (270/297) had pathologic HSIL+ using the LEEP. The diagnostic performance of CDB for identifying HSIL+ was as follows: sensitivity, 87.8%; specificity, 59.3%; balanced accuracy, 73.6%; positive predictive value, 95.6%; and negative predictive value, 32.7%. Thirty-three false negative cases of CDB included CIN2,3 (n=29) and cervical cancer (n=4). The pathologic HSIL+ rate in patients with HSIL− by CDB was 67.3% (33/49). CDB exhibited a significant difference in the diagnosis of HSIL+ compared to LEEP in all patients (p<0.001). In age subgroup analyses, age groups <35 years and 35–50 years showed good agreement with the entire data set (p=0.496 and p=0.406, respectively), while age group ≥50 years did not (p=0.036). CONCLUSION: A significant pathologic discrepancy was observed between CDB and LEEP results in women with cytologic HSILs. The diagnostic inaccuracy of CDB increased in those ≥50 years of age.
Adenocarcinoma in Situ; Biopsy; Cervical Intraepithelial Neoplasia; Cervix Uteri; Colposcopy; Conization; Dataset; Diagnosis; Early Detection of Cancer; Female; Humans; Papanicolaou Test; Retrospective Studies; Sensitivity and Specificity; Squamous Intraepithelial Lesions of the Cervix; Uterine Cervical Neoplasms
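The diagnostic figures reported in entry 5 can be checked against the counts the abstract gives (297 patients, 270 HSIL+ by LEEP, 49 HSIL− by CDB, 33 false negatives). The sketch below reconstructs the 2x2 table from those four numbers and recomputes the metrics. The function and variable names are hypothetical, and the reconstruction assumes every patient is classified as either HSIL+ or HSIL− by both procedures.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2-table metrics, with the reference test as ground truth."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts stated in the abstract.
total_patients = 297
leep_positive = 270   # HSIL+ on LEEP (reference standard)
cdb_negative = 49     # HSIL- on colposcopy-directed biopsy
fn = 33               # CDB negative but LEEP positive

# Derived cells (ASSUMES a complete binary 2x2 table).
tp = leep_positive - fn                 # CDB positive, LEEP positive
tn = cdb_negative - fn                  # CDB negative, LEEP negative
fp = total_patients - leep_positive - tn  # CDB positive, LEEP negative

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

Under these assumptions the derived cells are 237 true positives, 11 false positives, and 16 true negatives, and the recomputed sensitivity, specificity, and predictive values match the abstract to one decimal place. The balanced accuracy from unrounded values comes out near 73.5%; the reported 73.6% is presumably the average of the already rounded sensitivity and specificity.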
6.Nationwide Cross-sectional Study of Association between Pterygium and Alkaline Phosphatase in a Population from Korea
Hyun Joon KIM ; Sang Hoon RAH ; Sun Woong KIM ; Soo Han KIM
Journal of the Korean Ophthalmological Society 2020;61(1):9-16
PURPOSE: We determined whether elevated serum alkaline phosphatase (ALP) was related to prevalence, location, type, length, and recurrence of pterygium in a population from the Republic of Korea. METHODS: A nationwide cross-sectional dataset, the Korean National Health and Nutrition Examination Survey (2008–2011), was used in this study. All participants were > 30 years of age and underwent the ALP test and ophthalmic evaluation (n = 22,359). One-way analysis of variance, the chi-square test, and Fisher's exact test were used to compare characteristics and outcomes among participants. Multivariable logistic regression was used to examine the possible associations between serum ALP levels and various types of pterygium. Data were adjusted for known risk factors for development of pterygium and ALP elevation (age, sex, residence, sunlight exposure, drinking, smoking, hypertension, diabetes, BMI, AST, ALT, vitamin D, and HDL). RESULTS: The overall prevalence of pterygium was 8.1%, and participants with pterygium had higher levels of serum ALP (p < 0.001). Participants with higher serum ALP had a significantly higher prevalence of all types of pterygium than those in the lower serum ALP quartiles. After adjusting for potential confounding factors, multivariate logistic regression analysis revealed that ALP was associated with the prevalence of pterygium (odds ratio [OR], 1.001; p = 0.038). Trend analysis between the OR and ALP quartiles revealed a linear trend in overall prevalence and in the intermediate type of pterygium. Subgroup analysis revealed a stronger correlation in participants > 50 years of age. One-way analysis of variance revealed an association between the size of pterygium and serum ALP quartile levels. Serum ALP was not associated with recurrence of pterygium. CONCLUSIONS: Increased serum ALP was associated with the prevalence and size of pterygium.
Alkaline Phosphatase; Cross-Sectional Studies; Dataset; Drinking; Hypertension; Korea; Logistic Models; Nutrition Surveys; Prevalence; Pterygium; Recurrence; Republic of Korea; Risk Factors; Smoke; Smoking; Sunlight; Vitamin D
7.Dual deep neural network-based classifiers to detect experimental seizures.
The Korean Journal of Physiology and Pharmacology 2019;23(2):131-139
Manually reviewing electroencephalograms (EEGs) is labor-intensive and demands automated seizure detection systems. To construct an efficient and robust event detector for experimental seizures from continuous EEG monitoring, we combined spectral analysis and deep neural networks. A deep neural network was trained to discriminate periodograms of 5-sec EEG segments from annotated convulsive seizures and the pre- and post-EEG segments. To use the entire EEG for training, a second network was trained with non-seizure EEGs that were misclassified as seizures by the first network. By sequentially applying the dual deep neural networks and simple pre- and post-processing, our autodetector identified all seizure events in 4,272 h of test EEG traces, with only 6 false positive events, corresponding to 100% sensitivity and 98% positive predictive value. Moreover, with pre-processing to reduce the computational burden, scanning and classifying 8,977 h of training and test EEG datasets took only 2.28 h with a personal computer. These results demonstrate that combining a basic feature extractor with dual deep neural networks and rule-based pre- and post-processing can detect convulsive seizures with great accuracy and low computational burden, highlighting the feasibility of our automated seizure detection algorithm.
Animals; Dataset; Electroencephalography; Epilepsy; Mice; Microcomputers; Seizures*
8.Insights into the signal transduction pathways of mouse lung type II cells revealed by transcription factor profiling in the transcriptome
Genomics & Informatics 2019;17(1):e8-
Alveolar type II cells constitute a small fraction of the total lung cell mass. However, they play an important role in many cellular processes including trans-differentiation into type I cells as well as repair of lung injury in response to toxic chemicals and respiratory pathogens. Transcription factors are the regulatory proteins dynamically modulating DNA structure and gene expression. Transcription factor profiling in microarray datasets revealed that several members of AP1, ATF, NF-kB, and C/EBP families involved in diverse responses were expressed in mouse lung type II cells. A transcriptional factor signature consisting of Cebpa, Srebf1, Stat3, Klf5, and Elf3 was identified in lung type II cells, Sox9+ pluripotent lung stem cells as well as in mouse lung development. Identification of the transcription factor profile in mouse lung type II cells will serve as a useful resource and facilitate the integrated analysis of signal transduction pathways and specific gene targets in a variety of physiological conditions.
Animals; Dataset; DNA; Gene Expression; Humans; Lung Injury; Lung; Mice; NF-kappa B; Signal Transduction; Stem Cells; Transcription Factors; Transcriptome
9.Improving the CONTES method for normalizing biomedical text entities with concepts from an ontology with (almost) no training data
Arnaud FERRÉ ; Mouhamadou BA ; Robert BOSSY
Genomics & Informatics 2019;17(2):e20-
Entity normalization, or entity linking in the general domain, is an information extraction task that aims to annotate/bind multiple words/expressions in raw text with semantic references, such as concepts of an ontology. An ontology consists minimally of a formally organized vocabulary or hierarchy of terms, which captures knowledge of a domain. Presently, machine-learning methods, often coupled with distributional representations, achieve good performance. However, these require large training datasets, which are not always available, especially for tasks in specialized domains. CONTES (CONcept-TErm System) is a supervised method that addresses entity normalization with ontology concepts using small training datasets. CONTES has some limitations: it does not scale well with very large ontologies, it tends to overgeneralize predictions, and it lacks valid representations for out-of-vocabulary words. Here, we propose to assess different methods to reduce the dimensionality in the representation of the ontology. We also propose to calibrate parameters in order to make the predictions more accurate, and to address the problem of out-of-vocabulary words with a specific method.
Dataset; Information Storage and Retrieval; Methods; Semantics; Vocabulary
10.OryzaGP: rice gene and protein dataset for named-entity recognition
Pierre LARMANDE ; Huy DO ; Yue WANG
Genomics & Informatics 2019;17(2):e17-
Text mining has become an important research method in biology, with its original purpose being to extract biological entities, such as genes, proteins, and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development for plant molecular biology data have been performed, especially for rice, resulting in a lack of datasets available to solve named-entity recognition tasks for this species. Because few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extract information from gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of a set of titles and abstracts, extracted from scientific papers focusing on the rice species and downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task of rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using the dataset, to facilitate an open comparison and evaluation of different approaches to the task.
Benchmarking; Biology; Data Mining; Dataset; Machine Learning; Methods; Molecular Biology; Natural Language Processing; Oryza; Plants
