1. misMM: An Integrated Pipeline for Misassembly Detection Using Genotyping-by-Sequencing and Its Validation with BAC End Library Sequences and Gene Synteny.
Young Joon KO ; Jung Sun KIM ; Sangsoo KIM
Genomics & Informatics 2017;15(4):128-135
As next-generation sequencing technologies have advanced, enormous amounts of whole-genome sequence information from various species have been released. However, it is still difficult to assemble a whole genome precisely because of the inherent limitations of short-read sequencing technologies. Plant genomes in particular are considerably more complex than those of microorganisms or animals because of whole-genome duplications, repeat insertions, and Numt insertions. In this study, we describe a new method for detecting misassembled sequence regions of Brassica rapa with genotyping-by-sequencing, followed by MadMapper clustering. The misassembly candidate regions were cross-checked against BAC clone paired-end library sequences that had been mapped to the reference genome. The results were further verified with gene synteny relations between Brassica rapa and Arabidopsis thaliana. We conclude that this method will help detect misassembled regions and will be applicable to incompletely assembled reference genomes from a variety of species.
Animals; Arabidopsis; Brassica rapa; Clone Cells; Genome; Methods; Synteny*
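The abstract of entry 1 does not spell out how genotyping-by-sequencing markers and clustering point to a misassembly. A minimal sketch of one plausible reading follows: a marker is suspicious when the linkage group it clusters into disagrees with the group that dominates the chromosome where the reference assembly places it. The function name, the tuple layout, and the majority-vote rule are all assumptions made for illustration; this is not the misMM implementation and it does not call MadMapper.

```python
from collections import Counter, defaultdict


def flag_discordant_markers(markers):
    """Return markers whose linkage group differs from their chromosome's majority.

    markers: list of (marker_id, chromosome, position, linkage_group) tuples.
    This tuple layout is hypothetical, chosen only for the example.
    """
    # Collect the linkage groups observed on each reference chromosome.
    groups_by_chrom = defaultdict(list)
    for _, chrom, _, group in markers:
        groups_by_chrom[chrom].append(group)

    # The most frequent linkage group is taken as the expected one (assumption).
    majority = {
        chrom: Counter(groups).most_common(1)[0][0]
        for chrom, groups in groups_by_chrom.items()
    }

    # Markers that disagree with the majority are misassembly candidates.
    return [m for m in markers if m[3] != majority[m[1]]]


example_markers = [
    ("m1", "A01", 100, "LG1"),
    ("m2", "A01", 200, "LG1"),
    ("m3", "A01", 300, "LG2"),
    ("m4", "A02", 50, "LG2"),
    ("m5", "A02", 90, "LG2"),
]
candidates = flag_discordant_markers(example_markers)
```

Candidates found this way would still need independent support, which is the role the abstract assigns to the BAC paired-end mappings and the gene synteny comparison.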
2. Non-Synteny Regions in the Human Genome.
Genomics & Informatics 2010;8(2):86-89
Closely related species share large genomic segments called syntenic regions, in which genomic elements such as genes are arranged co-linearly among the species. While synteny is an important criterion in establishing orthologous regions between species, non-syntenic regions may display species-specific features. As the first step in cataloging human- or primate-specific genomic elements, we surveyed human genomic regions that are not syntenic with any non-primate mammalian genome sequenced so far. Based on the data compiled in the Ensembl databases, we identified 10 such regions located on eight different human chromosomes. Interestingly, most of these highly human- or primate-specific loci are concentrated in subtelomeric or pericentromeric regions. It has been reported that subtelomeric regions in human chromosomes are highly plastic and filled with recently shuffled genomic elements. Pericentromeric regions also show extensive segmental duplication. Such genomic rearrangements may have caused these large human- or primate-specific genome segments.
Cataloging; Chromosomes, Human; Genome; Genome, Human; Humans; Plastics; Resin Cements; Segmental Duplications, Genomic; Synteny
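The survey in entry 2 amounts to finding stretches of a chromosome that no syntenic block covers. A minimal sketch of that interval arithmetic is given below, assuming the syntenic blocks from all compared species have already been pooled into 0-based, half-open (start, end) pairs on one chromosome. The function name, the coordinate convention, and the `min_size` filter are illustrative assumptions; the abstract does not describe the authors' actual procedure or any Ensembl query.

```python
def uncovered_regions(chrom_length, syntenic_blocks, min_size=1):
    """Return (start, end) gaps on a chromosome not covered by any syntenic block.

    chrom_length: chromosome length in bases.
    syntenic_blocks: iterable of 0-based, half-open (start, end) intervals.
    min_size: smallest gap to report (hypothetical filter, in bases).
    """
    gaps = []
    covered_until = 0  # right edge of the region covered so far
    for start, end in sorted(syntenic_blocks):
        # A gap exists when the next block starts past the covered region.
        if start > covered_until and start - covered_until >= min_size:
            gaps.append((covered_until, start))
        covered_until = max(covered_until, end)
    # Report any uncovered tail at the end of the chromosome.
    if chrom_length - covered_until >= min_size:
        gaps.append((covered_until, chrom_length))
    return gaps


blocks = [(100, 400), (300, 600), (900, 1000)]
all_gaps = uncovered_regions(1000, blocks)
large_gaps = uncovered_regions(1000, blocks, min_size=200)
```

Raising `min_size` keeps only large uncovered segments, which matches the abstract's focus on large human- or primate-specific regions.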