1.Rules of moxibustion for low back pain by ZHOU Meisheng based on data mining and knowledge graph technology.
Chi WANG ; Caifeng ZHU ; Jiayu ZHANG ; Bingyuan ZHOU ; Xiaoyu CHEN ; Le CHENG ; Miaomiao XIE ; Xuechun DING
Chinese Acupuncture & Moxibustion 2025;45(6):823-833
OBJECTIVE:
To analyze the rules of moxibustion for low back pain by ZHOU Meisheng by using data mining and knowledge graph technology.
METHODS:
Taking the medical cases of moxibustion for low back pain from ZHOU Meisheng's legacy manuscripts and existing works as the research objects, information on disease types, symptoms, tongue manifestations, pulse conditions, syndrome patterns, moxibustion methods and acupoints was collected. Frequency statistics and community analysis were conducted with the ancient and modern medical record cloud platform V 2.3.7, cluster analysis of high-frequency acupoints was performed with SPSS 26.0, association rule analysis of high-frequency acupoints was performed with SPSS Modeler 18.0, and the generated linked data were imported into Cytoscape 3.9.1 for complex network analysis. A knowledge graph of moxibustion for low back pain by ZHOU Meisheng was constructed based on the results of data mining. Data storage and display of the knowledge graph were realized through the Neo4j 3.5.25 graph database, and the Cypher query language was used for knowledge graph retrieval and discovery.
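The association rule step described above can be illustrated with a minimal Python sketch. It is a brute-force pair count, not the algorithm used by SPSS Modeler, and it is not the authors' code. The function name `pair_rules`, the `min_support` threshold, and the toy prescriptions are all hypothetical and are not data from the study.

```python
from itertools import combinations


def pair_rules(cases, min_support=0.3):
    """Return (antecedent, consequent, support, confidence, lift) for acupoint pairs."""
    n = len(cases)
    case_sets = [set(c) for c in cases]

    # Count how many cases contain each single acupoint.
    item_count = {}
    for s in case_sets:
        for item in s:
            item_count[item] = item_count.get(item, 0) + 1

    # Count how many cases contain each unordered pair of acupoints.
    pair_count = {}
    for s in case_sets:
        for a, b in combinations(sorted(s), 2):
            pair_count[(a, b)] = pair_count.get((a, b), 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n
        if support < min_support:
            continue
        # Each frequent pair yields two directed rules: a -> b and b -> a.
        for x, y in ((a, b), (b, a)):
            confidence = count / item_count[x]
            lift = confidence / (item_count[y] / n)
            rules.append((x, y, support, confidence, lift))
    return rules


# Hypothetical toy prescriptions, NOT data from the study.
toy_cases = [
    ["BL23", "GV3", "tender point"],
    ["BL23", "GV3"],
    ["BL23", "GB34"],
    ["GV8", "tender point"],
]

for rule in pair_rules(toy_cases, min_support=0.5):
    print(rule)
```

Support is the share of cases containing both acupoints, confidence is that count divided by the number of cases containing the antecedent, and lift compares confidence with the baseline frequency of the consequent.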
RESULTS:
A total of 219 medical cases were collected, involving 14 disease types, 85 related clinical symptoms, 5 related TCM syndrome types, and 6 moxibustion methods. The acupoints were mostly attributed to the governor vessel, the bladder meridian of foot-taiyang, and non-meridian and non-acupoint areas. The core prescription of acupoints derived from complex network analysis included tender points, Shenshu (BL23), Jinsuo (GV8), Yinjiao (CV7), Yaoyangguan (GV3), and Yanglingquan (GB34), which largely coincided with the high-frequency acupoints. Cluster analysis obtained 4 cluster combinations. Community analysis yielded 6 communities, each corresponding to different acupoints. The constructed knowledge graph contained 187 nodes and 696 relationships. By retrieving clinical elements related to low back pain, the disease-moxibustion association graph, disease-acupoint association graph, accompanying symptom-acupoint association graph and syndrome type-matching point association graph were obtained.
CONCLUSION
When treating low back pain with moxibustion, ZHOU Meisheng adopts the principle of promoting circulation, distinguishing diseases to determine the treatment, selecting acupoints according to the diseases, and matching points according to the symptoms. He takes tender points, Shenshu (BL23), Jinsuo (GV8), Yinjiao (CV7), Yaoyangguan (GV3), and Yanglingquan (GB34) as core acupoints, combined with tender point selection, acupoint selection based on meridian and zangfu syndrome differentiation, "sunshine area" acupoint selection, and yin-yang acupoint matching. Additionally, he skillfully employs special points, such as Zhongzhu (KI15) and ear tips, pays attention to the reform of moxibustion tools, and innovates the moxibustion techniques, using distinctive moxibustion tools and methods to treat low back pain.
Moxibustion/methods*
;
Humans
;
Data Mining
;
Low Back Pain/history*
;
Acupuncture Points
;
History, Ancient
;
Female
;
China
;
Male
;
Adult
;
Middle Aged
2.Applications of AI Technology in Radiation Safety and Protection.
Chinese Journal of Medical Instrumentation 2025;49(4):429-434
The application of artificial intelligence (AI) technology in the field of radiation protection is gradually becoming a key force driving industrial development. This paper focuses on the applications of AI technology in radiation monitoring, shielding design, and nuclear science data mining, and discusses prospects for its future development. Through real-time monitoring and predictive analysis, AI technology has significantly improved the accuracy and efficiency of radiation monitoring, optimized the selection and configuration of shielding materials, and effectively reduced radiation exposure. Additionally, AI's data mining capabilities provide powerful tools for nuclear reactor design and optimization, promoting innovation in nuclear and radiological sciences. Despite technical challenges and ethical issues such as accuracy, data processing, and algorithmic transparency, the application prospects of AI in radiation protection remain broad. This paper emphasizes the critical role of AI technology in enhancing medical safety and efficiency, and foresees more future innovations and applications in the field of radiation protection.
Radiation Protection/methods*
;
Artificial Intelligence
;
Radiation Monitoring
;
Humans
;
Data Mining
;
Algorithms
3.Intelligent mining, engineering, and de novo design of proteins.
Cui LIU ; Zhenkun SHI ; Hongwu MA ; Xiaoping LIAO
Chinese Journal of Biotechnology 2025;41(3):993-1010
Natural components serve the survival instincts of cells that are obtained through long-term evolution, while they often fail to meet the demands of engineered cells for efficiently performing biological functions in special industrial environments. Enzymes, as biological catalysts, play a key role in biosynthetic pathways, significantly enhancing the rate and selectivity of biochemical reactions. However, the catalytic efficiency, stability, substrate specificity, and tolerance of natural enzymes often fall short of industrial production requirements. Therefore, exploring and modifying enzymes to suit specific biomanufacturing processes has become crucial. In recent years, artificial intelligence (AI) has played an increasingly important role in the discovery, evaluation, engineering, and de novo design of proteins. AI can accelerate the discovery and optimization of proteins by analyzing large amounts of bioinformatics data and predicting protein functions and characteristics by machine learning and deep learning algorithms. Moreover, AI can assist researchers in designing new protein structures by simulating and predicting their performance under different conditions, providing guidance for protein design. This paper reviews the latest research advances in protein discovery, evaluation, engineering, and de novo design for biomanufacturing and explores the hot topics, challenges, and emerging technical methods in this field, aiming to provide guidance and inspiration for researchers in related fields.
Protein Engineering/methods*
;
Artificial Intelligence
;
Proteins/genetics*
;
Computational Biology
;
Machine Learning
;
Data Mining
;
Algorithms
;
Deep Learning
4.Data mining in traditional Chinese medicine product quality review.
Sheng ZHANG ; Hou-Liu CHEN ; Hai-Bin QU
China Journal of Chinese Materia Medica 2023;48(5):1264-1272
Traditional Chinese medicine (TCM) enterprises have accumulated a large amount of product quality review (PQR) data. Mining these data can reveal the hidden knowledge in production and help improve pharmaceutical manufacturing technology. However, there are few studies involving the mining of PQR data, and thus enterprises lack guidance on how to analyze the data. This study proposed a method to mine the PQR data, which consisted of 4 functional modules: data collection and preprocessing, risk classification of variables, risk evaluation by batches, and the regression analysis of quality. Further, we carried out a case study of the formulation process of a TCM product to illustrate the method. In the case study, the data of 398 batches of products during 2019-2021 were collected, which contained 65 process variables. The risks of variables were classified according to the process performance index. The risk of each batch was analyzed through short-term and long-term evaluation, and the critical variables with the strongest impact on the product quality were identified by partial least squares regression. The results showed that 1 variable and 13 batches were of high risk, and the critical process variable was the quality of the intermediates. The proposed method enables enterprises to comprehensively mine the PQR data and helps to enhance the process understanding and improve the quality control.
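As a rough illustration of classifying a variable by its process performance index, the sketch below computes Ppk using the common textbook definition, min((USL - mean) / 3s, (mean - LSL) / 3s), with the overall sample standard deviation. The abstract does not state which index variant or cut-offs the authors used, so the 1.0 and 1.33 thresholds, the function names, and the measurements are assumptions for illustration only.

```python
from statistics import mean, stdev


def ppk(values, lsl, usl):
    """Process performance index based on the overall sample standard deviation."""
    mu = mean(values)
    sigma = stdev(values)
    return min((usl - mu) / (3 * sigma), (mu - lsl) / (3 * sigma))


def risk_class(index, high_below=1.0, low_from=1.33):
    """Map an index to a risk label; both cut-offs are assumed, not from the study."""
    if index < high_below:
        return "high"
    if index < low_from:
        return "medium"
    return "low"


# Hypothetical measurements and specification limits, NOT data from the study.
toy_values = [9.8, 10.1, 10.0, 9.9, 10.2]
index = ppk(toy_values, lsl=9.0, usl=11.0)
print(round(index, 3), risk_class(index))
```

A variable whose distribution sits close to one specification limit gets a small index and therefore a higher risk label under these assumed cut-offs.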
Medicine, Chinese Traditional
;
Drugs, Chinese Herbal
;
Data Mining/methods*
;
Quality Control
;
Technology, Pharmaceutical
5.Examining patterns of traditional Chinese medicine use in pediatric oncology: A systematic review, meta-analysis and data-mining study.
Chun Sing LAM ; Li Wen PENG ; Lok Sum YANG ; Ho Wing Janessa CHOU ; Chi-Kong LI ; Zhong ZUO ; Ho-Kee KOON ; Yin Ting CHEUNG
Journal of Integrative Medicine 2022;20(5):402-415
BACKGROUND:
Traditional Chinese medicine (TCM) is becoming a popular complementary approach in pediatric oncology. However, few or no meta-analyses have focused on clinical studies of the use of TCM in pediatric oncology.
OBJECTIVE:
We explored the patterns of TCM use and its efficacy in children with cancer, using a systematic review, meta-analysis and data mining study.
SEARCH STRATEGY:
We conducted a search of five English (Allied and Complementary Medicine Database, Embase, PubMed, Cochrane Central Register of Controlled Trials, and ClinicalTrials.gov) and four Chinese databases (Wanfang Data, China National Knowledge Infrastructure, Chinese Biomedical Literature Database, and VIP Chinese Science and Technology Periodicals Database) for clinical studies published before October 2021, using keywords related to "pediatric," "cancer," and "TCM."
INCLUSION CRITERIA:
We included studies which were randomized controlled trials (RCTs) or observational clinical studies, focused on patients aged < 19 years old who had been diagnosed with cancer, and included at least one group of subjects receiving TCM treatment.
DATA EXTRACTION AND ANALYSIS:
The methodological quality of RCTs and observational studies was assessed using the six-item Jadad scale and the Effective Public Healthcare Panacea Project Quality Assessment Tool, respectively. Meta-analysis was used to evaluate the efficacy of combining TCM with chemotherapy. Study outcomes included the treatment response rate and occurrence of cancer-related symptoms. Association rule mining (ARM) was used to investigate the associations among medicinal herbs and patient symptoms.
RESULTS:
The 54 studies included in this analysis were comprised of RCTs (63.0%) and observational studies (37.0%). Most RCTs focused on hematological malignancies (41.2%). The study outcomes included chemotherapy-induced toxicities (76.5%), infection rate (35.3%), and response, survival or relapse rate (23.5%). The methodological quality of most of the RCTs (82.4%) and observational studies (80.0%) was rated as "moderate." In studies of leukemia patients, adding TCM to conventional treatment significantly improved the clinical response rate (odds ratio [OR] = 2.55; 95% confidence interval [CI] = 1.49-4.36), lowered infection rate (OR = 0.23; 95% CI = 0.13-0.40), and reduced nausea and vomiting (OR = 0.13; 95% CI = 0.08-0.23). ARM showed that Radix Astragali, the most commonly used medicinal herb (58.0%), was associated with treating myelosuppression, gastrointestinal complications, and infection.
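For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained, the sketch below applies the standard Wald formula to a single 2x2 table. It does not reproduce the pooled meta-analytic estimates reported above. The function name and the counts are hypothetical, and all four cells are assumed to be non-zero.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval for a single 2x2 table.

    a, b: events / non-events in the treated group
    c, d: events / non-events in the control group
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, NOT data from the review.
estimate, lower, upper = odds_ratio_ci(a=30, b=10, c=20, d=20)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

An interval that excludes 1 corresponds to a statistically significant association at the chosen level.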
CONCLUSION
There is growing evidence that TCM is an effective adjuvant therapy for children with cancer. We proposed a checklist to improve the quality of TCM trials in pediatric oncology. Future work will examine the use of ARM techniques on real-world data to evaluate the efficacy of medicinal herbs and drug-herb interactions in children receiving TCM as a part of integrated cancer therapy.
Adult
;
Child
;
China
;
Combined Modality Therapy
;
Complementary Therapies
;
Data Mining
;
Drugs, Chinese Herbal/therapeutic use*
;
Humans
;
Medicine, Chinese Traditional/methods*
;
Observational Studies as Topic
;
Randomized Controlled Trials as Topic
;
Young Adult
6.Screening and identification of key genes ATP1B3 and ENAH in the progression of hepatocellular carcinoma: based on data mining and clinical validation.
Xue Jia YANG ; Yu Jie LI ; Deng Qiang WU ; Yi Li MA ; Su Fang ZHOU
Journal of Southern Medical University 2022;42(6):815-823
OBJECTIVE:
To explore the marker genes correlated with the prognosis, progression and clinical diagnosis of hepatocellular carcinoma (HCC) based on bioinformatics methods.
METHODS:
The TCGA-LIHC, GSE84432, GSE143233 and GSE63898 datasets from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) were analyzed. The differentially expressed genes (DEGs) shared by different disease types were obtained using GEO2R and the edgeR package, and Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses of the DEGs were performed. The expression levels of these DEGs in normal and cancerous tissues were verified in TCGA-LIHC to identify the upregulated genes in HCC. Survival analysis, receiver-operating characteristic (ROC) curve analysis, and correlation analysis between the key genes and the clinical features of the patients were carried out using the R language. The differential expressions of 15 key genes were verified in clinical samples of HCC and adjacent tissues using RT-qPCR.
RESULTS:
A total of 118 common DEGs were obtained in the database, and among them two genes, namely ATPase Na+/K+ transport subunit beta 3 (ATP1B3) and actin regulator (ENAH), showed increased expressions with disease progression. Survival analysis combined with the TCGA-LIHC dataset suggested that high expressions of ATP1B3 and ENAH were both significantly correlated with a poor prognosis of HCC patients (P < 0.05), and their AUC values were 0.821 and 0.933, respectively. A high expression of ATP1B3 was correlated with T stage, pathological stage and pathological grade of the tumors (P < 0.05), while that of ENAH was associated only with an advanced tumor grade (P < 0.05). The results of RT-qPCR showed that ATP1B3 and ENAH were both significantly upregulated in clinical HCC tissues (P < 0.05).
CONCLUSION
ATP1B3 and ENAH are both upregulated in HCC, and their high expressions may serve as biomarkers of progression of liver diseases and a poor prognosis of HCC.
Carcinoma, Hepatocellular/pathology*
;
Data Mining
;
Gene Expression Profiling/methods*
;
Gene Expression Regulation, Neoplastic
;
Humans
;
Liver Neoplasms/pathology*
;
Microfilament Proteins/metabolism*
;
Sodium-Potassium-Exchanging ATPase/metabolism*
7.Health Information Technology Trends in Social Media: Using Twitter Data
Jisan LEE ; Jeongeun KIM ; Yeong Joo HONG ; Meihua PIAO ; Ahjung BYUN ; Healim SONG ; Hyeong Suk LEE
Healthcare Informatics Research 2019;25(2):99-105
OBJECTIVES: This study analyzed the health technology trends and sentiments of users using Twitter data in an attempt to examine the public's opinions and identify their needs. METHODS: Twitter data related to health technology, from January 2010 to October 2016, were collected. An ontology related to health technology was developed. Frequently occurring keywords were analyzed and visualized with the word cloud technique. The keywords were then reclassified and analyzed using the developed ontology and sentiment dictionary. Python and the R program were used for crawling, natural language processing, and sentiment analysis. RESULTS: In the developed ontology, the keywords are divided into 'health technology' and 'health information'. Under health technology, there are six subcategories, namely, health technology, wearable technology, biotechnology, mobile health, medical technology, and telemedicine. Under health information, there are four subcategories, namely, health information, privacy, clinical informatics, and consumer health informatics. The number of tweets about health technology has consistently increased since 2010; the number of posts in 2014 was double that in 2010, which was about 150 thousand posts. Posts about mHealth accounted for the majority, and the dominant words were 'care', 'new', 'mental', and 'fitness'. Sentiment analysis by subcategory showed that most of the posts in nearly all subcategories had a positive tone with a positive score. CONCLUSIONS: Interest in mHealth has risen recently, and consequently, posts about mHealth were the most frequent. Examining social media users' responses to new health technology can be a useful method to understand the trends in rapidly evolving fields.
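The keyword frequency step that feeds a word cloud can be sketched in a few lines of Python. This is only an illustration of the general idea, not the authors' pipeline. The stopword list, the function name `keyword_counts`, and the example posts are hypothetical.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real analysis would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "for", "and", "to", "in", "is", "on"}


def keyword_counts(posts):
    """Count lowercase alphabetic tokens across posts, skipping stopwords."""
    counts = Counter()
    for post in posts:
        words = re.findall(r"[a-z]+", post.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Hypothetical posts, NOT data from the study.
toy_posts = [
    "New mental health care app for fitness",
    "Fitness trackers and the new care model",
]

print(keyword_counts(toy_posts).most_common(3))
```

The resulting counts could then be passed to any word cloud or plotting tool, or mapped onto ontology categories before sentiment scoring.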
Biomedical Technology
;
Biotechnology
;
Boidae
;
Data Mining
;
Informatics
;
Medical Informatics
;
Methods
;
Natural Language Processing
;
Privacy
;
Public Opinion
;
Social Media
;
Telemedicine
8.Classification of Common Relationships Based on Short Tandem Repeat Profiles Using Data Mining
Su Jin JEONG ; Hyo Jung LEE ; Soong Deok LEE ; Seung Hwan LEE ; Su Jeong PARK ; Jong Sik KIM ; Jae Won LEE
Korean Journal of Legal Medicine 2019;43(3):97-105
We reviewed past studies on the identification of familial relationships using 22 short tandem repeat markers. As a result, we can obtain a high discrimination power and a relatively accurate cut-off value in parent-child and full sibling relationships. However, in the case of pairs of uncle-nephew or cousin, we found a limit of low discrimination power of the likelihood ratio (LR) method. Therefore, we compare the LR ranking method and data mining techniques (e.g., logistic regression, linear discriminant analysis, diagonal linear discriminant analysis, diagonal quadratic discriminant analysis, K-nearest neighbor, classification and regression trees, support vector machines, random forest [RF], and penalized multivariate analysis) that can be applied to identify familial relationships, and provide a guideline for choosing the most appropriate model under a given situation. RF, one of the data mining techniques, was found to be more accurate than other methods. The accuracy of RF is 99.99% for parent-child, 99.44% for full siblings, 90.34% for uncle-nephew, and 79.69% for first cousins.
Classification
;
Data Mining
;
Discrimination (Psychology)
;
Forests
;
Humans
;
Logistic Models
;
Methods
;
Microsatellite Repeats
;
Siblings
;
Support Vector Machine
;
Trees
9.OryzaGP: rice gene and protein dataset for named-entity recognition
Pierre LARMANDE ; Huy DO ; Yue WANG
Genomics & Informatics 2019;17(2):e17-
Text mining has become an important research method in biology, with its original purpose to extract biological entities, such as genes, proteins and phenotypic traits, to extend knowledge from scientific papers. However, few thorough studies on text mining and application development for plant molecular biology data have been performed, especially for rice, resulting in a lack of datasets available to solve named-entity recognition tasks for this species. Since few benchmarks are available for rice, we faced various difficulties in exploiting advanced machine learning methods for accurate analysis of the rice literature. To evaluate several approaches to automatically extract information from gene/protein entities, we built a new dataset for rice as a benchmark. This dataset is composed of a set of titles and abstracts, extracted from scientific papers focusing on the rice species, and is downloaded from PubMed. During the 5th Biomedical Linked Annotation Hackathon, a portion of the dataset was uploaded to PubAnnotation for sharing. Our ultimate goal is to offer a shared task of rice gene/protein name recognition through the BioNLP Open Shared Tasks framework using the dataset, to facilitate an open comparison and evaluation of different approaches to the task.
Benchmarking
;
Biology
;
Data Mining
;
Dataset
;
Machine Learning
;
Methods
;
Molecular Biology
;
Natural Language Processing
;
Oryza
;
Plants
10.Identification and Validation of Circulating MicroRNA Signatures for Breast Cancer Early Detection Based on Large Scale Tissue-Derived Data.
Xiaokang YU ; Jinsheng LIANG ; Jiarui XU ; Xingsong LI ; Shan XING ; Huilan LI ; Wanli LIU ; Dongdong LIU ; Jianhua XU ; Lizhen HUANG ; Hongli DU
Journal of Breast Cancer 2018;21(4):363-370
PURPOSE: Breast cancer is the most commonly occurring cancer among women worldwide, and therefore, improved approaches for its early detection are urgently needed. As microRNAs (miRNAs) are increasingly recognized as critical regulators in tumorigenesis and possess excellent stability in plasma, this study focused on using miRNAs to develop a method for identifying noninvasive biomarkers. METHODS: To discover critical candidates, differential expression analysis was performed on tissue-originated miRNA profiles of 409 early breast cancer patients and 87 healthy controls from The Cancer Genome Atlas database. We selected candidates from the differentially expressed miRNAs and then evaluated every possible molecular signature formed by the candidates. The best signature was validated in independent serum samples from 113 early breast cancer patients and 47 healthy controls using reverse transcription quantitative real-time polymerase chain reaction. RESULTS: The miRNA candidates in our method were revealed to be associated with breast cancer according to previous studies and showed potential as useful biomarkers. When validated in independent serum samples, the area under curve of the final miRNA signature (miR-21-3p, miR-21-5p, and miR-99a-5p) was 0.895. Diagnostic sensitivity and specificity were 97.9% and 73.5%, respectively. CONCLUSION: The present study established a novel and effective method to identify biomarkers for early breast cancer. The method is also suitable for other cancer types. Furthermore, a combination of three miRNAs was identified as a prospective biomarker for breast cancer early detection.
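Sensitivity and specificity of a signature at a given cut-off follow directly from the counts of correct and incorrect calls. The sketch below shows the calculation under the assumption that each sample has a single signature score and that scores at or above the threshold are called positive. The function name, the scores, and the threshold are hypothetical and unrelated to the study's data.

```python
def sensitivity_specificity(labels, scores, threshold):
    """Compute sensitivity and specificity for a score cut-off.

    labels: 1 for patient, 0 for healthy control
    scores: signature score per sample; score >= threshold is called positive
    """
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_positive = score >= threshold
        if label == 1:
            if predicted_positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted_positive:
                fp += 1
            else:
                tn += 1
    sensitivity = tp / (tp + fn)  # share of patients correctly detected
    specificity = tn / (tn + fp)  # share of controls correctly ruled out
    return sensitivity, specificity


# Hypothetical labels and scores, NOT data from the study.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.2]
print(sensitivity_specificity(toy_labels, toy_scores, threshold=0.5))
```

Sweeping the threshold over all observed scores and plotting sensitivity against 1 - specificity gives the ROC curve whose area is the reported AUC-type metric.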
Area Under Curve
;
Biomarkers
;
Biomarkers, Tumor
;
Breast Neoplasms*
;
Breast*
;
Carcinogenesis
;
Data Mining
;
Early Detection of Cancer
;
Female
;
Genome
;
Humans
;
Methods
;
MicroRNAs*
;
Plasma
;
Prospective Studies
;
Real-Time Polymerase Chain Reaction
;
Reverse Transcription
;
Sensitivity and Specificity
