Search Results

1.Analysis of unmapped regions associated with long deletions in Korean whole genome sequences based on short read data

Genomics & Informatics 2019;17(4):e40-

While studies aimed at detecting and analyzing indels or single nucleotide polymorphisms within human genomic sequences have been actively conducted, studies on detecting long insertions/deletions are not easy to orchestrate. For the last 10 years, the availability of long read data of human genomes from PacBio or Nanopore platforms has increased, which makes it easier to detect long insertions/deletions. However, because long read data have a critical disadvantage due to their relatively high cost, many next generation sequencing data are produced mainly by short read sequencing machines. Here, we constructed programs to detect so-called unmapped regions (UMRs, where no reads are mapped on the reference genome), scanned 40 Korean genomes to select UMR long deletion candidates, and compared the candidates with the long deletion break points within the genomes available from the 1000 Genomes Project (1KGP). An average of about 36,000 UMRs were found in the 40 Korean genomes tested, 284 UMRs were common across the 40 genomes, and a total of 37,943 UMRs were found. Compared with the 74,045 break points provided by the 1KGP, 30,698 UMRs overlapped. As the number of compared samples increased from 1 to 40, the number of UMRs that overlapped with the break points also increased. This eventually reached a peak of 80.9% of the total UMRs found in this study. As the total number of overlapped UMRs could probably grow to encompass 74,045 break points with the inclusion of more Korean genomes, this approach could be practically useful for studies on long deletions utilizing short read data.
Genome ; Genome, Human ; Humans ; Nanopores ; Polymorphism, Single Nucleotide
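
The abstract above describes UMRs as reference regions where no reads are mapped, and a comparison of those regions against known deletion break points. The following is a minimal sketch of that idea, not the authors' program: it assumes per-base read depth is already available as a list of integers for one chromosome, and the function names, the `min_length` cutoff, and the `slack` tolerance are all hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Half-open, 0-based interval [start, end) on one chromosome (illustrative convention).
Interval = Tuple[int, int]


def find_unmapped_regions(depth: List[int], min_length: int = 1) -> List[Interval]:
    """Return maximal runs of zero read depth that are at least min_length long."""
    regions: List[Interval] = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0:
            if start is None:
                start = pos  # a zero-depth run begins here
        elif start is not None:
            if pos - start >= min_length:
                regions.append((start, pos))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(depth) - start >= min_length:
        regions.append((start, len(depth)))
    return regions


def overlaps_breakpoint(region: Interval, breakpoints: List[int], slack: int = 0) -> bool:
    """True if any break point falls inside the region, widened by slack bases."""
    start, end = region
    return any(start - slack <= bp < end + slack for bp in breakpoints)


def overlap_fraction(regions: List[Interval], breakpoints: List[int], slack: int = 0) -> float:
    """Fraction of regions that contain at least one break point."""
    if not regions:
        return 0.0
    hits = sum(overlaps_breakpoint(r, breakpoints, slack) for r in regions)
    return hits / len(regions)


# Toy data, invented for illustration only.
depth = [3, 2, 0, 0, 0, 4, 0, 5, 0, 0]
umrs = find_unmapped_regions(depth, min_length=2)
fraction = overlap_fraction(umrs, breakpoints=[4, 20])
```

On real data the depth values would come from an alignment file rather than a hand-written list, and a linear scan over break points would likely be replaced by a sorted or indexed lookup. As a check on the reported figures, 30,698 overlapping UMRs out of 37,943 total gives roughly 80.9%, which matches the percentage stated in the abstract.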

3.Identification of Ethnically Specific Genetic Variations in Pan-Asian Ethnos.

Jin Ok YANG ; Sohyun HWANG ; Woo Yeon KIM ; Seong Jin PARK ; Sang Cheol KIM ; Kiejung PARK ; Byungwook LEE

Genomics & Informatics 2014;12(1):42-47

Asian populations contain a variety of ethnic groups that have ethnically specific genetic differences. Ethnic variants may be highly relevant in disease and human differentiation studies. Here, we identified ethnically specific variants and then investigated their distribution across Asian ethnic groups. We obtained 58,960 Pan-Asian single nucleotide polymorphisms of 1,953 individuals from 72 ethnic groups of 11 Asian countries. We selected 9,306 ethnic variant single nucleotide polymorphisms (ESNPs) and 5,167 ethnic variant copy number polymorphisms (ECNPs) using the nearest shrunken centroid method. We analyzed ESNPs and ECNPs in 3 hierarchical levels: superpopulation, subpopulation, and ethnic population. We also identified ESNP- and ECNP-related genes and their features. This study represents the first attempt to identify Asian ESNP and ECNP markers, which can be used to identify genetic differences and predict disease susceptibility and drug effectiveness in Asian ethnic populations.
Asian Continental Ancestry Group ; Classification ; Disease Susceptibility ; DNA Copy Number Variations ; Ethnic Groups ; Genetic Variation* ; Genotype ; Humans ; Polymorphism, Single Nucleotide

4.ManBIF: a Program for Mining and Managing Biobank Impact Factor Data.

Ki Jin YU ; Jungmin NAM ; Yun HER ; Minseock CHU ; Hyungseok SEO ; Junwoo KIM ; Jaepil JEON ; Hyekyung PARK ; Kiejung PARK

Genomics & Informatics 2011;9(1):37-38

5.PromoterWizard: An Integrated Promoter Prediction Program Using Hybrid Methods.

Kiejung PARK ; Ki Bong KIM

Genomics & Informatics 2011;9(4):194-196

Promoter prediction is a very important problem and is closely related to the main problems of bioinformatics such as the construction of gene regulatory networks and gene function annotation. In this context, we developed an integrated promoter prediction program using hybrid methods, PromoterWizard, which can be employed to detect the core promoter region and the transcription start site (TSS) in vertebrate genomic DNA sequences, an issue of obvious importance for genome annotation efforts. PromoterWizard consists of three main modules and two auxiliary modules. The three main modules include CDRM (Composite Dependency Reflecting Model) module, SVM (Support Vector Machine) module, and ICM (Interpolated Context Model) module. The two auxiliary modules are CpG Island Detector and GCPlot that may contribute to improving the predictive accuracy of the three main modules and facilitating human curator to decide on the final annotation.
Base Sequence ; Chimera ; Computational Biology ; CpG Islands ; Dependency (Psychology) ; Gene Regulatory Networks ; Genome ; Humans ; Promoter Regions, Genetic ; Transcription Initiation Site ; Vertebrates

6.COCAW: A Genome-wide Pattern Search System for Designing Microbial Probes.

7.WinBioDBs: A Windows-based Integrated Program for Manipulating Major Biological Databases.

8.BioStore: A Repository System for Registering and Distributing Public Biology Databases.

9.Computational Approach for Biosynthetic Engineering of Post-PKS Tailoring Enzymes.

10.Computational Approach for the Analysis of Post-PKS Glycosylation Step.
