1.BioCovi: A Visualization Service for Comparative Genomics Analysis.
Jungsul LEE ; Daeui PARK ; Jong BHAK
Genomics & Informatics 2005;3(2):52-54
Visualization of homology information is an important method for analyzing the evolutionary and functional meanings of genes. With a database containing model genomes of Homo sapiens, Mus musculus, and Rattus norvegicus, we constructed a web-based comparative analysis tool, BioCovi, to visualize the homology information of mammalian sequences on a very large scale. The user interface has several features: it marks regions whose identity is greater than a specified value, it shows or hides gaps from the result of global sequence alignment, and it inverts the graph when total identity is higher than the specified threshold.
Animals; Genome; Genomics*; Humans; Mice; Rats; Sequence Alignment
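The identity-threshold marking described in the BioCovi abstract can be illustrated with a minimal sketch. It assumes a pairwise global alignment supplied as two equal-length gapped strings and a fixed sliding window; the function names, the window size, and the choice to count gap columns as mismatches are illustrative assumptions, not BioCovi's actual implementation.

```python
from typing import List, Tuple


def window_identity(a: str, b: str, window: int) -> List[float]:
    """Percent identity for each full window of two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or window > len(a):
        raise ValueError("window must be between 1 and the alignment length")
    # Assumption: a column matches only when both residues are identical and not a gap.
    matches = [int(x == y and x != "-") for x, y in zip(a, b)]
    count = sum(matches[:window])
    identities = [100.0 * count / window]
    # Slide the window one column at a time, updating the running match count.
    for i in range(window, len(matches)):
        count += matches[i] - matches[i - window]
        identities.append(100.0 * count / window)
    return identities


def marked_regions(identities: List[float], threshold: float) -> List[Tuple[int, int]]:
    """Half-open (start, end) runs of window indices whose identity exceeds the threshold."""
    regions: List[Tuple[int, int]] = []
    start = None
    for i, value in enumerate(identities):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(identities)))
    return regions
```

A viewer could shade the returned index ranges on the identity graph; hiding gaps would amount to dropping alignment columns that contain "-" before computing the windows.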
2.Post-GWAS Strategies.
Genomics & Informatics 2011;9(1):1-4
Genome-wide association (GWA) studies are the method of choice for discovering loci associated with common diseases. More than a thousand GWA studies have reported successful identification of statistically significant association signals in human genomes for a variety of complex diseases. In this review, I discuss some of the issues related to the future of GWA studies and their biomedical applications.
Genome, Human; Genome-Wide Association Study; Humans
3.HExDB: Human EXon DataBase for Alternative Splicing Pattern Analysis.
Junghwan PARK ; Minho LEE ; Jong BHAK
Genomics & Informatics 2005;3(3):80-85
HExDB is a database for analyzing exon and splicing pattern information in Homo sapiens. HExDB is useful for two specific purposes: 1) to design primers for exon amplification from cDNA and 2) to understand the change of ORFs by alternative splicing. HExDB was constructed by integrating data from AltExtron, which is a computationally predicted exon database, Ensembl cDNA annotation, and the recently published Affymetrix genome tile. Although it may contain false positive data, HExDB is a good starting point due to its sensitivity. At present, there are as many as 2,046,519 exons stored in HExDB. We found that 16.8% of the exons in the database were constitutive exons and 83.1% were novel gene exons.
Alternative Splicing*; Animals; DNA, Complementary; Ecthyma, Contagious; Exons*; Genome; Humans*; Open Reading Frames
4.BioSubroutine: an Open Web Server for Bioinformatics Algorithms and Subroutines.
Joowon LEE ; Hana KIM ; Wonhye LEE ; Dongil CHUNG ; Jong BHAK
Genomics & Informatics 2005;3(1):35-38
We present BioSubroutine, an open depository server that automatically categorizes various subroutines frequently used in bioinformatics research. We processed a large bioinformatics subroutine library called Bio.pl that was the first Bioperl subroutine library built in 1995. Over 1000 subroutines were processed automatically and an HTML interface has been created. BioSubroutine can accept new subroutines and algorithms from any such subroutine library, as well as provide interactive user forms. The subroutines are stored in an SQL database for quick searching and accessing. BioSubroutine is an open access project under the BioLicense license scheme.
Computational Biology*; Licensure
5.Personal Genomics, Bioinformatics, and Variomics.
Jong BHAK ; Ho GHANG ; Rohit REJA ; Sangsoo KIM
Genomics & Informatics 2008;6(4):161-165
In 2008, at least five complete genome sequences are available. It is known that there are over 15,000,000 genetic variants, called SNPs, in the dbSNP database. The cost of full genome sequencing in 2009 is claimed to be less than $5000 USD. The genomics era has arrived in 2008. This review introduces technologies, bioinformatics, genomics visions, and variomics projects. Variomics is the study of the total genetic variation in an individual and in populations. Research on genetic variation is the most valuable among the many branches of genomics research. Genomics and variomics projects will change biology and society so dramatically that biology will become an everyday technology like personal computers and the internet. 'BioRevolution' is the term that can adequately describe this change.
Biology; Computational Biology; Genetic Variation; Genome; Genomics; Humans; Internet; Microcomputers; Polymorphism, Single Nucleotide; Vision, Ocular
6.A Clinical Risk Score to Predict In-hospital Mortality from COVID-19 in South Korea
Ae-Young HER ; Youngjune BHAK ; Eun Jung JUN ; Song Lin YUAN ; Scot GARG ; Semin LEE ; Jong BHAK ; Eun-Seok SHIN
Journal of Korean Medical Science 2021;36(15):e108-
Background:
Early identification of patients with coronavirus disease 2019 (COVID-19) who are at high risk of mortality is of vital importance for appropriate clinical decision making and delivering optimal treatment. We aimed to develop and validate a clinical risk score for predicting mortality at the time of admission of patients hospitalized with COVID-19.
Methods:
Collaborating with the Korea Centers for Disease Control and Prevention (KCDC), we established a prospective consecutive cohort of 5,628 patients with confirmed COVID-19 infection who were admitted to 120 hospitals in Korea between January 20, 2020, and April 30, 2020. The cohort was randomly divided using a 7:3 ratio into a development (n = 3,940) and validation (n = 1,688) set. Clinical information and complete blood count (CBC) detected at admission were investigated using Least Absolute Shrinkage and Selection Operator (LASSO) and logistic regression to construct a predictive risk score (COVID-Mortality Score). The discriminative power of the risk model was assessed by calculating the area under the curve (AUC) of the receiver operating characteristic curves.
Results:
The incidence of mortality was 4.3% in both the development and validation sets. A COVID-Mortality Score consisting of age, sex, body mass index, combined comorbidity, clinical symptoms, and CBC was developed. AUCs of the scoring system were 0.96 (95% confidence interval [CI], 0.85–0.91) and 0.97 (95% CI, 0.84–0.93) in the development and validation set, respectively. If the model was optimized for > 90% sensitivity, accuracies were 81.0% and 80.2% with sensitivities of 91.7% and 86.1% in the development and validation set, respectively. The optimized scoring system has been applied to the public online risk calculator (https://www.diseaseriskscore.com).
Conclusion:
This clinically developed and validated COVID-Mortality Score, using clinical data available at the time of admission, will aid clinicians in predicting in-hospital mortality.
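The 7:3 random split and the sensitivity and accuracy figures reported above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the fixed random seed, and the rule that a score at or above the cutoff predicts death are hypothetical and are not taken from the study's code.

```python
import random
from typing import List, Sequence, Tuple


def split_cohort(n: int, dev_fraction: float = 0.7, seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly split patient indices 0..n-1 into development and validation sets."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # seed is an assumption, for reproducibility only
    n_dev = round(n * dev_fraction)
    return indices[:n_dev], indices[n_dev:]


def sensitivity_and_accuracy(
    scores: Sequence[float], died: Sequence[bool], cutoff: float
) -> Tuple[float, float]:
    """Sensitivity and accuracy when a score >= cutoff is read as predicted death."""
    if len(scores) != len(died) or not scores:
        raise ValueError("scores and outcomes must be non-empty and of equal length")
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, died):
        predicted = score >= cutoff
        if predicted and outcome:
            tp += 1
        elif predicted and not outcome:
            fp += 1
        elif not predicted and outcome:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0:
        raise ValueError("sensitivity is undefined without any deaths")
    return tp / (tp + fn), (tp + tn) / len(scores)
```

With n = 5628, `split_cohort` yields sets of 3,940 and 1,688 indices, matching the sizes stated in the Methods. Lowering the cutoff raises sensitivity at the expense of accuracy, which is the trade-off behind optimizing for > 90% sensitivity.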
8.Biological Object Downloader (BOD) Service for Easy Download and Management of Biological Databases.
Daeui PARK ; Jungwoo LEE ; Giseok YOON ; Sungsam GONG ; Jong BHAK
Genomics & Informatics 2007;5(4):196-199
BOD is an FTP service management tool on the Internet. It was developed for biological researchers in South Korea. It enables easier and faster access to bioinformation without having to go through foreign FTP sites. BOD includes an automatic downloader with a management and email alert service from which the user can easily select and schedule any biological database. Once a database is listed in BOD, the user can check and modify the download status and data through an additional email alert service.
Appointments and Schedules; Electronic Mail; Internet; Korea
9.New Lung Cancer Panel for High-Throughput Targeted Resequencing.
Eun Hye KIM ; Sunghoon LEE ; Jongsun PARK ; Kyusang LEE ; Jong BHAK ; Byung Chul KIM
Genomics & Informatics 2014;12(2):50-57
We present a new next-generation sequencing-based method to identify somatic mutations of lung cancer. It is a comprehensive mutation profiling protocol to detect somatic mutations in 30 genes found frequently in lung adenocarcinoma. The total length of the target regions is 107 kb, and a capture assay was designed to cover 99% of it. This method exhibited about 97% mean coverage at 30x sequencing depth and 42% average specificity when sequencing of more than 3.25 Gb was carried out for the normal sample. We discovered 513 variations from targeted exome sequencing of lung cancer cells, which is 3.9-fold higher than in the normal sample. The variations in cancer cells included previously reported somatic mutations in the COSMIC database, such as variations in TP53, KRAS, and STK11 of sample H-23 and in EGFR of sample H-1650, especially with more than 1,000x coverage. Among the somatic mutations, up to 91% of single nucleotide polymorphisms from the two cancer samples were validated by DNA microarray-based genotyping. Our results demonstrated the feasibility of high-throughput mutation profiling with lung adenocarcinoma samples, and the profiling method can be used as a robust and effective protocol for somatic variant screening.
Adenocarcinoma; DNA; Exome; High-Throughput Nucleotide Sequencing; Lung; Lung Neoplasms*; Mass Screening; Polymorphism, Single Nucleotide; Sensitivity and Specificity
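The "mean coverage at 30x sequencing depth" figure in the abstract above can be read as the percentage of target bases sequenced to at least 30 reads, averaged over samples. The sketch below assumes that interpretation and assumes per-base depths are already available as lists of integers; the function names are hypothetical and the abstract does not describe the authors' actual pipeline.

```python
from typing import Sequence


def coverage_at_depth(depths: Sequence[int], min_depth: int = 30) -> float:
    """Percent of target bases whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for depth in depths if depth >= min_depth)
    return 100.0 * covered / len(depths)


def mean_coverage(samples: Sequence[Sequence[int]], min_depth: int = 30) -> float:
    """Average of the per-sample coverage percentages."""
    if not samples:
        raise ValueError("samples must not be empty")
    return sum(coverage_at_depth(s, min_depth) for s in samples) / len(samples)
```

For a 107 kb target, each depth list would hold roughly 107,000 values, one per targeted base.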
10.BioCC: An Openfree Hypertext Bio Community Cluster for Biology.
Sungsam GONG ; Tae Hyung KIM ; Jungsu OH ; Jekeun KWON ; Su An CHO ; Dan BOLSER ; Jong BHAK
Genomics & Informatics 2006;4(3):125-128
We present an openfree hypertext (also known as wiki) web cluster called BioCC. BioCC is a novel wiki farm that lets researchers create hundreds of biological web sites. The web sites form an organic information network. The contents of all the sites on the BioCC wiki farm are modifiable by anonymous as well as registered users. This enables biologists with diverse backgrounds to form their own Internet bio-communities. Each community can have custom-made layouts for information, discussion, and knowledge exchange. BioCC aims to form an ever-expanding network of openfree biological knowledge databases used and maintained by biological experts, students, and general users. The philosophy behind BioCC is that the formation of biological knowledge is best achieved by open-minded individuals freely exchanging information. In the near future, the amount of genomic information will have flooded society. BioCC can be an effective and quickly updated knowledge database system. BioCC uses an opensource wiki system called Mediawiki. However, for easier editing, a modified version of Mediawiki, called Biowiki, has been applied. Unlike Mediawiki, Biowiki uses a WYSIWYG (What You See Is What You Get) text editor. BioCC is under a share-alike license called BioLicense (http://biolicense.org). The BioCC top level site is found at http://bio.cc/
Anonyms and Pseudonyms; Biology*; Computational Biology; Humans; Hypermedia*; Information Services; Internet; Licensure; Linear Energy Transfer; Philosophy