1.A Simple and Fast Web Alignment Tool for Large Amount of Sequence Data.
Genomics & Informatics 2008;6(3):157-159
Multiple sequence alignment (MSA) is the most important step in many biological sequence analyses, homology searches, and protein structural assignments. However, large amounts of data make MSA analyses difficult for biologists to perform, and aligning many sequences requires considerable computational time. Here, we have developed a simple and fast web alignment tool for aligning, editing, and visualizing large amounts of sequence data. We used a cluster server running ClustalW-MPI, accessed through web services and parallelized with the message passing interface (MPI). The tool also enables users to manually edit multiple sequence alignments and to download the input data and results, such as alignments and phylogenetic trees.
Sequence Alignment ; Sequence Analysis
2.A Simple Java Sequence Alignment Editing Tool for Resolving Complex Repeat Regions.
Seong Il HAM ; Kyung Eun LEE ; Hyun Seok PARK
Genomics & Informatics 2009;7(1):46-48
Finishing is the most time-consuming step in sequencing, and many genome projects are left unfinished due to complex repeat regions. Here, we have developed BACContigEditor, a prototype shotgun sequence finishing tool. It is essentially an editor that visualizes assemblies of shotgun sequence fragment reads as gapped multiple alignments. The program offers the flexibility needed to rapidly resolve complex regions within a working session. The sole purpose of the release is to promote the collaborative development of extensible fragment assembly editors and to reduce the barriers to initial tool development. We describe our software architecture and identify current challenges. The program is available under an Open Source license.
Genome ; Indonesia ; Pliability ; Sequence Alignment
3.Recent progress in multiple sequence alignment.
Fan YANG ; Dongming TANG ; Yong BAI ; Mingyuan ZHAO ; Qingxin ZHU
Journal of Biomedical Engineering 2010;27(4):924-928
Multiple sequence alignment is one of the basic techniques in bioinformatics, and it plays a vital role in structure modeling, functional site prediction, and phylogenetic analysis. In this paper, we review the methodologies and recent advances in multiple protein sequence alignment, such as speeding up the calculation of distances among sequences and employing iterative refinement and consistency-based scoring functions, with emphasis on the use of additional sequence and structural information to improve alignment quality.
Algorithms ; Proteins ; chemistry ; Sequence Alignment ; methods ; Sequence Analysis, Protein ; methods
4.Alignment-free biomolecular sequence comparison method.
Weijuan FU ; Yuanyuan WANG ; Daru LU
Journal of Biomedical Engineering 2005;22(3):598-605
Biosequence analysis is the primary research field of bioinformatics. In this field, useful information can be extracted by comparative analysis methods, of which sequence alignment is the most common. However, sequence comparison by alignment assumes conservation of contiguity between homologous segments, which is at odds with genetic recombination. Multiple sequence alignment in particular suffers from high computational complexity. Therefore, alignment-free sequence comparison methods are required. In this paper, two main categories of alignment-free sequence comparison methods are reviewed. The first is based on word (oligomer) frequency and its distribution: sequences are compared using distances defined in a Cartesian space by their frequency vectors. In the second category, sequences are compared using Kolmogorov complexity and chaos theory.
Algorithms ; Computational Biology ; Sequence Alignment ; Sequence Analysis ; methods
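The word-frequency category described in the abstract above can be illustrated with a minimal sketch. It is not the reviewed authors' method: the function names, the DNA alphabet, the word length k, and the choice of Euclidean distance are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product
import math


def kmer_frequencies(seq, k=2, alphabet="ACGT"):
    """Return the relative frequency of every k-mer over `alphabet`, in a fixed order."""
    seq = seq.upper()
    # Count every overlapping word of length k in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all possible words so both sequences share the same vector layout.
    words = ["".join(p) for p in product(alphabet, repeat=k)]
    total = sum(counts[w] for w in words)  # words with other symbols are ignored
    if total == 0:
        return [0.0] * len(words)
    return [counts[w] / total for w in words]


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length frequency vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def kmer_distance(seq_a, seq_b, k=2):
    """Alignment-free distance: compare two sequences by their k-mer frequency vectors."""
    return euclidean_distance(kmer_frequencies(seq_a, k), kmer_frequencies(seq_b, k))
```

No alignment is computed at any point, so the cost grows linearly with sequence length for a fixed k; the trade-off is that positional information is discarded.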
5.Enzyme ancestral sequence reconstruction and directed evolution.
Kun ZHANG ; Yifei DAI ; Jindi SUN ; Jiachen LU ; Kequan CHEN
Chinese Journal of Biotechnology 2021;37(12):4187-4200
The amino acid sequences of ancestral enzymes from extinct organisms can be deduced through an in silico approach termed ancestral sequence reconstruction (ASR). ASR usually has six steps: collection of nucleic acid/amino acid sequences of modern enzymes, multiple sequence alignment, phylogenetic tree construction, computational deduction of the ancestral enzyme sequence, gene cloning, and characterization of enzyme properties. This method is widely used to study the mechanisms by which molecules adapt and evolve in response to changing environmental conditions on a planetary time scale. As enzymes play key roles in biocatalysis, ASR has become a powerful tool for studying the relationship among the sequence, structure, and function of enzymes. Notably, most ancestral enzymes show better temperature stability and mutation stability, making them ideal protein scaffolds for further directed evolution. This article summarizes the computer algorithms, applications, and commonly used software of ASR, and discusses its potential application in the directed evolution of enzymes.
Amino Acid Sequence ; Evolution, Molecular ; Phylogeny ; Proteins/genetics* ; Sequence Alignment
6.Identification of a new human leukocyte antigen A allele, HLA-A*3020.
Xiaofeng LI ; Xu ZHANG ; Yang CHEN ; Kunlian ZHANG ; Xianzhi LIU ; Jianping LI
Chinese Journal of Medical Genetics 2010;27(1):96-99
OBJECTIVE: To identify a novel human leukocyte antigen (HLA) A allele.
METHODS: A new HLA-A allele was found during routine HLA genotyping by polymerase chain reaction-sequence specific oligonucleotide probes (PCR-SSOP) and sequencing-based typing (SBT).
RESULTS: The novel HLA-A*30 allele was identical to A*300101 except that the nucleotide C at position 294 of exon 2 was substituted by A, changing codon 98 from GAC (D) to GAA (E).
CONCLUSION: A new HLA allele, HLA-A*3020, was identified and was officially named by the WHO Nomenclature Committee.
Alleles ; Base Sequence ; HLA-A Antigens ; chemistry ; genetics ; Humans ; Molecular Sequence Data ; Sequence Alignment ; Sequence Analysis, DNA
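The arithmetic linking nucleotide position 294 to codon 98 in the abstract above can be checked with a short sketch. It assumes that the position is counted 1-based from the first base of the coding sequence; the abstract itself says only "position 294 of exon 2". The helper names are hypothetical, and the codon table holds only the two codons the abstract mentions.

```python
# Minimal codon table: only the two codons named in the abstract.
CODON_TO_AA = {"GAC": "D", "GAA": "E"}


def locate_in_codon(nt_position):
    """Map a 1-based nucleotide position to (codon number, 1-based base within the codon)."""
    codon_number = (nt_position - 1) // 3 + 1
    base_in_codon = (nt_position - 1) % 3 + 1
    return codon_number, base_in_codon


def substitute(codon, base_in_codon, new_base):
    """Return `codon` with the base at the 1-based position replaced by `new_base`."""
    i = base_in_codon - 1
    return codon[:i] + new_base + codon[i + 1:]


codon_number, base_in_codon = locate_in_codon(294)
mutant = substitute("GAC", base_in_codon, "A")
print(codon_number, base_in_codon, mutant, CODON_TO_AA["GAC"], CODON_TO_AA[mutant])
# prints: 98 3 GAA D E
```

Under that numbering assumption, position 294 is the third base of codon 98, so a C-to-A change turns GAC (aspartate) into GAA (glutamate), matching the reported result.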
7.An examination of the OMIM database for associating mutation to a consensus reference sequence.
Zuofeng LI ; Beili YING ; Xingnan LIU ; Xiaoyan ZHANG ; Hong YU
Protein & Cell 2012;3(3):198-203
Gene mutations (e.g., substitutions, insertions, and deletions) and related phenotype information are important biomedical knowledge. Many biomedical databases (e.g., OMIM) incorporate such data. However, few studies have examined the quality of these data. In the current study, we examined the quality of protein single-point mutations in OMIM and determined whether the corresponding reference sequences align with the mutation positions. Our results show that close to 20% of the mutation data cannot be mapped to a single reference sequence. The failed mappings are caused by position conflict, site shifting (peptide, N-terminal methionine), and other types of data error. We propose a preliminary model to resolve such inconsistencies in the OMIM database.
Amino Acid Sequence ; Consensus Sequence ; Databases, Genetic ; Molecular Sequence Data ; Point Mutation ; Sequence Alignment
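The kind of consistency check described in the abstract above (does the reference residue at the reported position match the wild-type residue, possibly after a site shift such as a missing N-terminal methionine?) might be sketched as follows. This is an illustration only, not the authors' model: the mutation notation, the function names, and the candidate offsets are assumptions.

```python
import re

# Assumed notation: wild-type residue, 1-based position, mutant residue (e.g. "D98E").
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(mutation):
    """Split a single-point mutation such as 'D98E' into (wild type, position, mutant)."""
    match = MUTATION_PATTERN.match(mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def find_consistent_offsets(mutation, reference, offsets=(0, 1, -1)):
    """Return the offsets at which the reference residue equals the reported wild type."""
    wild_type, position, _ = parse_mutation(mutation)
    consistent = []
    for offset in offsets:
        index = position + offset - 1  # convert the shifted 1-based position to 0-based
        if 0 <= index < len(reference) and reference[index] == wild_type:
            consistent.append(offset)
    return consistent
```

With a toy reference `"MKDLE"`, the mutation `"D3E"` maps directly (offset 0), whereas `"D2E"` maps only at offset +1, the pattern expected when positions were numbered without the initial methionine. An empty result would correspond to a position conflict.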
8.BioCovi: A Visualization Service for Comparative Genomics Analysis.
Jungsul LEE ; Daeui PARK ; Jong BHAK
Genomics & Informatics 2005;3(2):52-54
Visualization of homology information is an important method for analyzing the evolutionary and functional significance of genes. Using a database containing the model genomes of Homo sapiens, Mus musculus, and Rattus norvegicus, we constructed a web-based comparative analysis tool, BioCovi, to visualize the homology information of mammalian sequences on a very large scale. The user interface has several features: it marks regions whose identity is greater than a specified value, it shows or hides gaps from the result of global sequence alignment, and it inverts the graph when total identity is higher than the specified threshold.
Animals ; Genome ; Genomics* ; Humans ; Mice ; Rats ; Sequence Alignment
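Marking regions whose identity exceeds a specified value, as the abstract above describes, can be illustrated with a small sketch over a gapped pairwise alignment. This is not BioCovi's implementation: the fixed non-overlapping windows, the function names, and the rule that gap columns never count as matches are assumptions.

```python
def window_identity(aligned_a, aligned_b, window=10):
    """Return (start, end, identity) for each non-overlapping window of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identities = []
    for start in range(0, len(aligned_a), window):
        cols = list(zip(aligned_a[start:start + window], aligned_b[start:start + window]))
        # A column matches only if both residues are equal and neither is a gap.
        matches = sum(1 for a, b in cols if a == b and a != "-")
        identities.append((start, start + len(cols), matches / len(cols)))
    return identities


def regions_above(aligned_a, aligned_b, threshold, window=10):
    """Return the (start, end) windows whose identity is strictly greater than `threshold`."""
    return [
        (start, end)
        for start, end, identity in window_identity(aligned_a, aligned_b, window)
        if identity > threshold
    ]
```

For example, `regions_above("ACGTACGT--AC", "ACGTTTTTGGAC", threshold=0.5, window=4)` keeps only the first window, since the remaining two have identities of 0.25 and 0.5.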
9.FASMA: a service to format and analyze sequences in multiple alignments.
Susan COSTANTINI ; Giovanni COLONNA ; Angelo M FACCHIANO
Genomics, Proteomics & Bioinformatics 2007;5(3-4):253-255
Multiple sequence alignments are successfully applied in many studies for understanding the structural and functional relations among single nucleic acid and protein sequences as well as whole families. Because of the rapid growth of sequence databases, multiple sequence alignments can often be very large and difficult to visualize and analyze. We offer a new service for visualizing and analyzing multiple alignments obtained with different external algorithms, with new features useful for comparing the aligned sequences as well as for creating a final image of the alignment. The service is named FASMA and is available at http://bioinformatica.isa.cnr.it/FASMA/.
Algorithms ; Computational Biology ; Internet ; Sequence Alignment ; statistics & numerical data ; Software
10.Identification and characterization of DIR gene family in Schisandra chinensis.
Yu-Qing DONG ; Ting-Yan QIANG ; Jiu-Shi LIU ; Bin LI ; Xue-Ping WEI ; Yao-Dong QI ; Hai-Tao LIU ; Ben-Gang ZHANG
China Journal of Chinese Materia Medica 2021;46(20):5270-5277
Dirigent (DIR) proteins are involved in the biosynthesis of lignin, lignans, and gossypol in plants and respond to biotic and abiotic stresses. Based on the full-length transcriptome of Schisandra chinensis, bioinformatics methods were used to preliminarily identify the DIR gene family and analyze the physicochemical properties, subcellular localization, conserved motifs, phylogeny, and expression patterns of the proteins. A total of 34 DIR genes were identified, and the encoded proteins ranged from 156 to 387 aa in length. The physicochemical properties of the proteins differed, and their secondary structure was mainly random coil. Half of the DIR proteins were located in the chloroplast, while the others were located in the extracellular region, endoplasmic reticulum, cytoplasm, etc. Phylogenetic analysis of DIR proteins from S. chinensis and 8 other species, including Arabidopsis thaliana, Oryza sativa, and Glycine max, demonstrated that all DIR proteins clustered into 5 subfamilies and that the DIR proteins from S. chinensis fell into 4 of them. The DIR-a subfamily has a unique structure of 8 β-sheets, as verified by multiple sequence alignment. Finally, the expression pattern of DIR was clarified by analyzing the transcriptome of S. chinensis fruit at different developmental stages. Combined with the accumulation of lignans in fruits at different stages, the results suggest that DIR might be related to the synthesis of lignans in S. chinensis. This study lays a theoretical basis for exploring the biological functions of DIR genes and elucidating the biosynthesis pathway of lignans in S. chinensis.
Fruit/genetics* ; Lignans/analysis* ; Phylogeny ; Schisandra ; Sequence Alignment