List of alignment visualization software
This page is a subsection of List of sequence alignment software.
Multiple alignment visualization tools typically serve four purposes:
- General comprehension of large-scale DNA or protein alignments
- Visualization of alignments for figures and publication
- Manual editing and curation of automatically generated alignments
- In-depth analysis
The rest of this article focuses on multiple global alignments of homologous proteins. The first two purposes are a natural consequence of the fact that most computational representations of alignments and their annotation are not human-readable and are best portrayed in the familiar sequence-row and alignment-column format, of which examples are widespread in the literature. The third is a necessity because both multiple sequence alignment and structural alignment algorithms use heuristics that do not always perform perfectly. The fourth reflects how interactive graphical tools let a worker involved in sequence analysis conveniently run a variety of different computational tools, either to explore an alignment's phylogenetic implications or to predict the structure and functional properties of a specific sequence (e.g. comparative modelling).
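The sequence-row and alignment-column layout mentioned above can be illustrated with a short sketch. The Python below is not taken from any of the tools listed on this page. It assumes an aligned FASTA input in which gaps are written as `-`, and the function names (`parse_aligned_fasta`, `conservation_line`, `render_blocks`) and the example sequences are hypothetical. It wraps the alignment into fixed-width blocks and adds a line that marks fully conserved, gap-free columns with `*`, similar in spirit to the conservation line of Clustal-style output.

```python
from io import StringIO


def parse_aligned_fasta(handle):
    """Read an aligned FASTA stream into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # Use the first word of the header as the row label.
            words = line[1:].split()
            name, chunks = (words[0] if words else "unnamed"), []
        else:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    # Every row of an alignment must span the same number of columns.
    if len({len(seq) for _, seq in records}) > 1:
        raise ValueError("sequences in an alignment must have equal length")
    return records


def conservation_line(records):
    """Mark fully conserved, gap-free columns with '*' and all others with ' '."""
    columns = zip(*(seq for _, seq in records))
    return "".join(
        "*" if len(set(col)) == 1 and col[0] != "-" else " " for col in columns
    )


def render_blocks(records, width=60):
    """Lay the alignment out as name-labelled rows wrapped into blocks."""
    if not records:
        return ""
    pad = max(len(name) for name, _ in records) + 2
    marks = conservation_line(records)
    length = len(records[0][1])
    blocks = []
    for start in range(0, length, width):
        rows = [name.ljust(pad) + seq[start:start + width] for name, seq in records]
        rows.append(" " * pad + marks[start:start + width])
        blocks.append("\n".join(rows))
    return "\n\n".join(blocks)


# Hypothetical three-sequence alignment used only for illustration.
EXAMPLE = """>seqA
MKT-AYIAKQR
>seqB
MKTLAYLAKQR
>seqC
MKT-AYIGKQR
"""

if __name__ == "__main__":
    print(render_blocks(parse_aligned_fasta(StringIO(EXAMPLE)), width=8))
```

Real viewers add colouring, annotation tracks and editing on top of this basic layout. The sketch shows only why a block view is easier to read than the raw file.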
Alignment viewers/editors
Name | Integrated with Struct. Prediction Tools | Can Align Sequences | Can Calculate Phylogenetic Trees | Other Features | Formats Supported | License | Link |
---|---|---|---|---|---|---|---|
Ale (emacs plugin) | No | Yes | No | No | GenBank, EMBL, Fast-A, and Phylip | GPL | link |
BioEdit | No | ClustalW | rudimentary, can read phylip | plasmid drawing, ABI chromatograms | Genbank, Fasta, Phylip 3.2, Phylip 4, NBRF/PIR | Free | link |
CINEMA | No, but can read/show 2D structure annotations | ClustalW | No | Dotplot, 6 frame translation, Blast | Nexus, MSF, Clustal, FASTA, PHYLIP, PIR, PRINTS | Free | link |
CLC viewer (Free version) | only in commercial version | Clustal, Muscle, T-Coffee, MAFFT, kalign, various | UPGMA, NJ | workflows, blast/genbank search | many | Freeware. More options available in commercial versions. | link, table of features |
ClustalX viewer | No | ClustalW | Neighbor-joining | Alignment quality analysis | Nexus, MSF, Clustal, FASTA, PHYLIP | Free for academic users | link |
Cylindrical BLAST Viewer | No | No | No | 3D, Animation, Drilldown, Legend Selection | BLAST XML, proprietary XML, GFF3, ClustalW, INSDSet, user expandable with XSLT | GPL | link |
DnaSP | | | | can compute several population genetics statistics, reconstruct haplotypes with PHASE | FASTA, Nexus, Mega, PHYLIP | Freeware | link |
emacs - biomode | | | | | | | link |
Genedoc | No, but can read/show annotations | Pairwise | No, but can read/show annotations | gel simulation, stats, multiple views, simple | many | Free | link, table of features |
Geneious Pro | Yes - powered by EMBOSS tools | Clustal, Muscle, MAUVE, profile, translation | UPGMA, NJ, PhyML, MrBayes plugin, PAUP* plugin | Whole genome assembly, restriction analysis, cloning, primer design, dotplot and much more | >40 file formats imported and exported | Geneious Basic (freeware); Geneious Pro (commercial with student and academic discounts) | link |
Integrated Genome Browser (IGB) | No | No | No | sequences and features from files, URLs, and arbitrary DAS and QuickLoad servers | BAM, FASTA, PSL | CPL | link |
Jalview 2 | Secondary Struct. Prediction via JNET | Clustal, Muscle, MAFFT, Probcons, TCoffee via web services | UPGMA, NJ | sequences and features from arbitrary and publicly registered DAS servers, PFAM, PDB, EMBL and Uniprot Accession retrieval | FASTA, PFAM, MSF, Clustal, BLC, PIR, Stockholm | GPL | link |
MEGA | No | Native ClustalW | UPGMA, NJ, ME, MP, with bootstrap and confidence test | extended support to phylogenetics analysis | FASTA, Clustal, Nexus, Mega, etc. | Freeware, registration requested | link, table of features |
Multiseq (VMD plugin) | No, but can display and align 3D structures | ClustalW, MAFFT, Stamp (structural) | Percent identity, Clustal, MAFFT, Structural | Scripting via Tcl, mapping from sequence to 3D structure | FASTA, PDB, ALN, PHYLIP, NEXUS | Free, but VMD is free only for noncommercial use | link |
MView | No | No | No | stacked alignments from blast and fasta suites, various MSA format conversions, HTML markup, consensus patterns | BLAST search, FASTA search, Clustal, HSSP, FASTA, PIR, MSF | GPL | link |
PFAAT | No, but can display 3D structures | ClustalW | Neighbor-joining | Manual annotation, conservation scores | Nexus, MSF, Clustal, FASTA, PFAAT | Free | link |
Ralee (emacs plugin for RNA alignment editing) | RNA structure | | | | Stockholm | GPL | link |
S2S RNA editor | 2D structure | Rnalign | No | base-base interactions, 2D, 3D viewer | FASTA, RnaML | Free | link |
Seaview | No | local Muscle/ClustalW | Parsimony, distance methods, PhyML | Dot-plot, vim-like editing keys | Nexus, MSF, Clustal, FASTA, PHYLIP, MASE | | link |
Sequilab | Yes | Yes | No | Link alignment results to analysis tools (Primer design, Gel mobility and Maps, Plasmapper, siRNA design, Epitope prediction), Save research logs, Create custom toolbars | Accession number, GI number, PDB ID, FASTA, drag-and-drop from external URL from within the user interface | Freeware | link |
SeqPop | No | | | | | Free | link |
Strap | Jnet, NNPREDICT, Coiled coil, 16 different TM-helix | Fifteen different methods | Neighbor-joining | Dot-plot, Structure-neighbors, 3D-superposition, Blast-search, Mutation/SNP analysis, Sequence features, BioJava-interface | MSF, Stockholm, ClustalW, Nexus, FASTA, PDB, Embl, GenBank, hssp, Pfam | GPL | link |
UGENE | Yes | MUSCLE, KAlign | Yes | many | FASTA, FASTQ, GenBank, EMBL, ABIF, SCF, CLUSTALW, Stockholm, Newick, PDB, MSF, GFF | GPL | link |
VISSA sequence/structure viewer | DSSP secondary structure | ClustalX | No | Mapping from sequence to 3D structure | Clustal, FASTA | Free | link |
Some useful discussions on sequence alignment editors/viewers can be found here:
- http://lists.open-bio.org/pipermail/emboss/2008-July/003324.html
See also
- Sequence alignment software
- Biological data visualization