Pash Comparison Method
Pash (Kalafus KJ, Jackson AR, and Milosavljevic A (2004). Pash: Efficient
Genome-Scale Sequence Anchoring by Positional Hashing.
Genome Research 14: 672-678; Coarfa C and Milosavljevic A (2008).
Pash 2.0: Scaleable Sequence Anchoring for Next-Generation Sequencing Technologies.
Pacific Symposium on Biocomputing 13: 102-113) implements a novel approach to the
large-scale sequence comparison problem. The aim of this project is to compare entire genomes,
or large databases of sequence reads, quickly and sensitively, even on modest hardware resources.
Pash is currently developed by Cristian Coarfa.
New Pash Release
A new version of Pash has been released! Several new features are available:
- Multidiagonal collation of kmers, for accurate anchoring across indels
- Postprocessing of read mappings, with an optional, user-directed alignment step
- Speed improvements of up to a factor of 3 over Pash 1.2, due to more efficient collation algorithms and more effective hashing and inversion of matching kmers
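
To illustrate the idea behind collating kmer matches across diagonals, here is a minimal Python sketch. It is not Pash's implementation: the function names (`kmer_index`, `diagonal_counts`, `multidiagonal_score`) and the `band` parameter are hypothetical, and it uses a plain dictionary index rather than positional hashing. A diagonal is taken to be the target position minus the query position of a matching kmer; an indel shifts later matches onto a neighboring diagonal, so summing over a small band of diagonals keeps them together.

```python
from collections import defaultdict


def kmer_index(seq, k):
    """Map each kmer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def diagonal_counts(query, target, k):
    """Count matching kmers per diagonal (target position minus query position)."""
    index = kmer_index(target, k)
    counts = defaultdict(int)
    for q in range(len(query) - k + 1):
        for t in index.get(query[q:q + k], ()):
            counts[t - q] += 1
    return dict(counts)


def multidiagonal_score(counts, diagonal, band):
    """Sum kmer matches on `diagonal` and on all diagonals within `band` of it."""
    return sum(counts.get(d, 0) for d in range(diagonal - band, diagonal + band + 1))


# The target carries a one-base insertion ("T") relative to the query,
# so matches after the insertion fall on diagonal 1 instead of diagonal 0.
query = "ACGTACGGTTCA"
target = "ACGTACTGGTTCA"
counts = diagonal_counts(query, target, k=4)
single = multidiagonal_score(counts, diagonal=0, band=0)
banded = multidiagonal_score(counts, diagonal=0, band=1)
```

In this toy case a single diagonal sees only the matches on one side of the insertion, while a band of width 1 collects the matches on both sides.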
Genomes Compared
With the completion of the Rat draft genome and the mature drafts of the Mouse and
Human genomes, we decided to perform genome-scale similarity comparisons between these species
using our Pash method. Output of Pash was post-processed to retain only the most
extensive and significant orthologous hits. Post-processing involved a reciprocal-best-match
criterion and merging of collinear hits. All similarity annotations are symmetrical across compared
genomes.
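
The two post-processing steps named above can be sketched in Python as follows. This is an illustration under stated assumptions, not the pipeline actually used: the hit tuple layouts, the function names `reciprocal_best` and `merge_collinear`, and the `max_gap` threshold are all hypothetical, and the merge step assumes forward-strand hits on a single pair of sequences.

```python
def reciprocal_best(hits):
    """Keep hits whose two regions are each other's best-scoring match.

    hits: list of (a_id, b_id, score) tuples (assumed layout).
    """
    best_a, best_b = {}, {}
    for a, b, score in hits:
        if a not in best_a or score > best_a[a][1]:
            best_a[a] = (b, score)
        if b not in best_b or score > best_b[b][1]:
            best_b[b] = (a, score)
    return [(a, b, s) for a, (b, s) in best_a.items() if best_b[b][0] == a]


def merge_collinear(hits, max_gap):
    """Merge consecutive hits that advance in both sequences by at most max_gap.

    hits: list of (a_start, a_end, b_start, b_end) tuples (assumed layout).
    """
    merged = []
    for hit in sorted(hits):
        if merged:
            a0, a1, b0, b1 = merged[-1]
            gap_a = hit[0] - a1
            gap_b = hit[2] - b1
            if 0 <= gap_a <= max_gap and 0 <= gap_b <= max_gap:
                # Extend the previous block to cover this hit.
                merged[-1] = (a0, hit[1], b0, hit[3])
                continue
        merged.append(hit)
    return merged


scored = [("h1", "r1", 90), ("h1", "r2", 50), ("h2", "r1", 70), ("h2", "r3", 60)]
kept = reciprocal_best(scored)

blocks = merge_collinear(
    [(100, 200, 1000, 1100), (210, 300, 1105, 1200), (5000, 5100, 300, 400)],
    max_gap=50,
)
```

Because a pair is kept only when each side is the other's best match, the resulting annotations are the same whichever genome is treated as the query, which is one way to obtain the symmetry described above.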
Pash was used for pairwise comparisons among three genomes: Human, Mouse, and Rat.
Visualizations of the results are available in a VGP Gallery within Genboree.
Pash Performance
The Human/Rat, Mouse/Rat, and Human/Mouse comparisons were each completed in 4 days on 6
computers in a Linux farm, using a non-optimized prototype implementation of Pash.
Obtaining Pash
For information on obtaining the Pash source code and support tools, please refer to the
license and download page.