CoCoNUT
Computational Comparative GeNomics Utilities Toolkit
This is the website of
CoCoNUT,
a software tool for versatile comparative genomics tasks. The user
manual and tutorial
contain many examples of
how to use CoCoNUT for efficient
genome comparison.
Here we outline the most important features
of CoCoNUT.
CoCoNUT is a software tool
for performing the following
comparative genomics tasks:
- Global alignment for two or multiple whole genomes.
- Finding regions of high similarity (candidate regions of conserved
synteny) among two or multiple genomes.
- Comparison of a draft genome (a draft genome is not a single string but
a set of strings called contigs) to a finished genome or to another draft
genome (the current version is limited to at most 2 draft genomes).
- cDNA/EST mapping.
- Repeat analysis and detection of large segmental duplications.
CoCoNUT is based on the anchor-based strategy that is composed of three
phases:
- Computation of fragments.
- Computation of highest-scoring chains of colinear fragments. The fragments
of these chains make up the set of anchors.
- Alignment of the regions between the fragments of a chain
by applying the same method recursively with less stringent
parameters or by using standard dynamic programming.
This strategy can be used for solving the aforementioned tasks.
For example, if genomes of closely-related organisms are compared,
where there are no (or few) genome rearrangements, then in the second
phase CoCoNUT can be used to
compute an optimal global chain of colinear non-overlapping
fragments.
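To make the idea of a global chain concrete, here is a minimal sketch for two genomes. It is not CHAINER's algorithm: CHAINER relies on techniques from computational geometry, whereas this sketch uses a simple quadratic dynamic program. The `Fragment` layout, the `precedes` test and the function name `global_chain` are assumptions made for illustration only.

```python
from typing import List, NamedTuple, Tuple


class Fragment(NamedTuple):
    """A matching region (hypothetical layout): positions are inclusive."""
    start1: int
    end1: int
    start2: int
    end2: int
    weight: float


def precedes(f: Fragment, g: Fragment) -> bool:
    """f can come before g in a chain: colinear and non-overlapping in both genomes."""
    return f.end1 < g.start1 and f.end2 < g.start2


def global_chain(fragments: List[Fragment]) -> Tuple[float, List[Fragment]]:
    """Return the highest total weight and one chain achieving it (O(n^2) sketch)."""
    frags = sorted(fragments, key=lambda f: (f.start1, f.start2))
    n = len(frags)
    if n == 0:
        return 0.0, []
    best = [f.weight for f in frags]  # best[j]: best chain score ending at fragment j
    prev = [-1] * n                   # predecessor of j in that chain, -1 if none
    for j in range(n):
        for i in range(j):
            if precedes(frags[i], frags[j]) and best[i] + frags[j].weight > best[j]:
                best[j] = best[i] + frags[j].weight
                prev[j] = i
    end = max(range(n), key=best.__getitem__)
    chain = []
    k = end
    while k != -1:                    # walk the predecessor links back to the start
        chain.append(frags[k])
        k = prev[k]
    chain.reverse()
    return best[end], chain
```

Sorting by start position in the first genome guarantees that every possible predecessor of a fragment is processed before it, which is what makes the single pass valid.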
If genomes of distantly-related organisms are compared,
where rearrangement events are very likely to take place,
then CoCoNUT can be used for computing
a set of significant local chains.
Each local chain represents a region
of high similarity among the genomes in comparison.
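The following sketch illustrates one way local chains could be computed for two genomes. It is a simplified quadratic-time illustration, not the algorithm implemented in CHAINER. It assumes that connecting two fragments is penalized by the L1 distance between them, so that a chain is only extended across a gap when the gain outweighs the cost. The tuple layout, the parameters `threshold` and `gap_scale`, and the rule for reporting chains are all assumptions.

```python
from typing import List, Tuple

# Hypothetical layout: (start1, end1, start2, end2, weight), inclusive positions.
Fragment = Tuple[int, int, int, int, float]


def local_chains(fragments: List[Fragment], threshold: float,
                 gap_scale: float = 1.0) -> List[Tuple[float, List[Fragment]]]:
    """Return (score, chain) pairs with score >= threshold, best first."""
    frags = sorted(fragments)
    n = len(frags)
    score = [f[4] for f in frags]  # a chain may start at any fragment
    prev = [-1] * n
    for j, g in enumerate(frags):
        for i in range(j):
            f = frags[i]
            if f[1] < g[0] and f[3] < g[2]:  # f lies strictly before g in both genomes
                gap = (g[0] - f[1] - 1) + (g[2] - f[3] - 1)  # assumed L1 gap cost
                candidate = score[i] + g[4] - gap_scale * gap
                if candidate > score[j]:
                    score[j] = candidate
                    prev[j] = i
    used = set()
    result = []
    for end in sorted(range(n), key=lambda k: score[k], reverse=True):
        if score[end] < threshold:
            break
        path = []
        k = end
        while k != -1:
            path.append(k)
            k = prev[k]
        if used.intersection(path):  # simplification: skip chains sharing a fragment
            continue
        used.update(path)
        result.append((score[end], [frags[k] for k in reversed(path)]))
    return result
```

Because a chain's score starts from the weight of a single fragment, distant fragments are never forced into the same chain; each reported chain corresponds to one compact region of similarity.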
CoCoNUT extends the program MGA
in that it computes local alignments and handles both forward and reverse
strands.
Moreover, CoCoNUT includes the following post-processing capabilities:
- Interactive visualization of comparison results using a Java-based program called VisCHAINER.
- Detection of syntenic regions and reporting these sets as permutations
for studying genome rearrangements.
- Clustering cDNAs for detecting alternative splices and repeated genes.
- Assembling draft genomes, by comparison to finished genomes (under development).
For more details, see the user manual and tutorial.
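As an illustration of what reporting syntenic regions as a permutation can mean for two genomes, the sketch below numbers the regions by their order in the first genome and then lists them in their order in the second genome, with a negative sign for regions on the reverse strand. The input layout and the function name `signed_permutation` are assumptions; the output format that CoCoNUT actually writes is described in the user manual, not here.

```python
from typing import List, Tuple

# Hypothetical layout: (start in genome 1, start in genome 2, strand '+' or '-').
Block = Tuple[int, int, str]


def signed_permutation(blocks: List[Block]) -> List[int]:
    """Express the order of syntenic blocks in genome 2 relative to genome 1."""
    # Label blocks 1..n according to their order in genome 1.
    order_g1 = sorted(range(len(blocks)), key=lambda i: blocks[i][0])
    label = {idx: rank + 1 for rank, idx in enumerate(order_g1)}
    # Read the labels off in genome 2 order; reverse-strand blocks get a minus sign.
    order_g2 = sorted(range(len(blocks)), key=lambda i: blocks[i][1])
    return [label[i] if blocks[i][2] == '+' else -label[i] for i in order_g2]


blocks = [(100, 900, '+'), (300, 100, '-'), (600, 500, '+')]
print(signed_permutation(blocks))  # [-2, 3, 1]
```

A signed permutation of this kind is the usual input for studying rearrangements such as inversions and transpositions.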
CoCoNUT is based on algorithms and data structures
optimized to handle large datasets:
It uses the Vmatch package, which
is based on the enhanced suffix array, for generating the fragments.
It uses the program CHAINER, which
is based on techniques from computational geometry, for computing chains
specific to the comparative genomics task at hand.
Other programs are implemented to post-process the resulting chains. This post-processing depends on the task carried out, and it includes, among others,
computing alignment on the nucleotide level, finding syntenic regions, and visualizing the results.
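To show what a fragment is in the simplest case, the sketch below finds exact matches of a minimum length between two sequences. It is a stand-in, not Vmatch: Vmatch uses the enhanced suffix array, while this sketch uses a plain k-mer dictionary, which is easy to read but far less memory-efficient on whole genomes. The function name `exact_match_fragments` and the output tuple layout are assumptions.

```python
from collections import defaultdict
from typing import List, Tuple


def exact_match_fragments(s1: str, s2: str,
                          min_len: int) -> List[Tuple[int, int, int, int, int]]:
    """Return maximal exact matches as (start1, end1, start2, end2, length).

    Positions are 0-based and inclusive. Only forward-strand matches are found.
    """
    # Index every substring of length min_len in s1 by its start position.
    index = defaultdict(list)
    for i in range(len(s1) - min_len + 1):
        index[s1[i:i + min_len]].append(i)

    fragments = []
    for j in range(len(s2) - min_len + 1):
        for i in index.get(s2[j:j + min_len], ()):
            # Skip seeds that can be extended to the left: they belong to a
            # match that was already reported from an earlier seed.
            if i > 0 and j > 0 and s1[i - 1] == s2[j - 1]:
                continue
            # Extend the seed to the right as far as the sequences agree.
            length = min_len
            while (i + length < len(s1) and j + length < len(s2)
                   and s1[i + length] == s2[j + length]):
                length += 1
            fragments.append((i, i + length - 1, j, j + length - 1, length))
    return sorted(fragments)
```

Tuples of this shape, with the match length used as the weight, are the kind of input a chaining step can consume.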
CoCoNUT is free for
academic research, educational and demonstration purposes.
- Download CoCoNUT:
Please send the CoCoNUT-license agreement
to the author in order to obtain the download link.
Note that you also need a license agreement for Vmatch (if you do not already have one) to obtain its binaries.
- Download the Visualization Tool VisCHAINER:
VisCHAINER Webpage
For a commercial license, please contact the author directly.
CoCoNUT is available for the following platforms:
- Linux (Red Hat, SuSE, Debian) for Intel and AMD architectures
- Apple Mac OS
The standard version of CoCoNUT is compiled in 32-bit mode.
For large server-class machines (e.g., SUN-Sparc/Solaris),
CoCoNUT can be compiled in 64-bit mode.
If you need CoCoNUT for an additional platform,
please contact the author.
Please see the user manual and tutorial for details.
Here, you can download
the test data (size is about 36 MB)
needed to run the
examples of the tutorial.
- Mohamed Ibrahim Abouelhoda,
Previously at the Dept. of Bioinformatics,
University of Ulm,
Germany.
The CoCoNUT-manual
The CoCoNUT-license agreement form
Fragment generation tool:
We recommend using
Vmatch
and the Multimat/Ramaco
program.
However, CoCoNUT can use any kind of fragments as long as they
are given in the correct input format; this requires you to edit some lines in
the program.
Perl
Gnuplot: (optional) for producing postscript images of the comparison results.
Java: (optional) for running the interactive visualization tool VisCHAINER.
Please send your comments and suggestions to the authors.
CoCoNUT is part of the
DFG-Projekt:
Entwicklung eines Software-Systems zum multiplen
Genomvergleich supported by the DFG-grant Oh 54/4-1.
My thanks to Enno Ohlebusch, Stefan Kurtz, Janina Reeder, and Kathrin Hockel
for their help and useful
suggestions.
- Mohamed I. Abouelhoda, Stefan Kurtz,
Enno Ohlebusch
CoCoNUT: an efficient system for the comparison and analysis of
genomes
BMC Bioinformatics, 9:476, 2008.
- Mohamed Ibrahim Abouelhoda, Enno Ohlebusch
CHAINER: Software for Comparing Genomes.
In 12th International Conference on Intelligent Systems for Molecular Biology/3rd European Conference on Computational Biology.
- Mohamed Ibrahim Abouelhoda, Enno Ohlebusch
Chaining Algorithms for Multiple Genome Comparison
Journal of Discrete Algorithms, to appear.
- Mohamed Ibrahim Abouelhoda, Enno Ohlebusch
A Local Chaining Algorithm and its Applications in Comparative
Genomics
Proceedings of the 3rd Workshop on Algorithms in Bioinformatics,
pages 1-16, LNBI 2812, 2003.
© Springer-Verlag
- Mohamed Ibrahim Abouelhoda, Stefan Kurtz,
Enno Ohlebusch
Replacing Suffix Trees with Enhanced Suffix Arrays
Journal of Discrete Algorithms, 2(1):53-86, 2004.