PIntron: a fast method for gene structure prediction via maximal pairings of a pattern and a text

PIntron: a fast method for gene structure prediction via maximal pairings of a pattern and a text, Yuri Pirola, ICCABS 2011 (slides).

In this work, we propose a novel pipeline for computational gene-structure prediction based on spliced alignment of expressed sequences (ESTs and mRNAs). This pipeline, called PIntron, is composed of four steps. First, alternative alignments of expressed sequences to a reference genomic sequence are implicitly computed and represented in a graph (called the embedding graph) by a novel, fast spliced alignment procedure. Second, biologically meaningful alignments are extracted from this graph. Third, a consensus gene structure induced by the previously computed alignments is determined according to a parsimony principle. Finally, the resulting introns are reconciled and classified according to general biological criteria. The software, released under the GNU Affero General Public License, can be freely downloaded from this page.
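To give some intuition for the "maximal pairings of a pattern and a text" mentioned in the title, the following is a minimal, deliberately naive sketch. It is not the PIntron implementation. It assumes that a maximal pairing is a triple (p, t, length) such that the pattern (the expressed sequence) and the text (the genomic sequence) share a common substring of that length starting at positions p and t, and that this match cannot be extended to the left or to the right. The function name `maximal_pairings`, the `min_len` parameter, and the toy sequences are hypothetical and chosen only for illustration.

```python
def maximal_pairings(pattern, text, min_len=3):
    """Return (p, t, length) triples, in increasing (p, t) order, such that
    pattern[p:p+length] == text[t:t+length] and the match can be extended
    neither to the left nor to the right.

    Naive illustration only; not the fast procedure the abstract refers to.
    """
    pairings = []
    for p in range(len(pattern)):
        for t in range(len(text)):
            if pattern[p] != text[t]:
                continue
            # Skip starts that are not left-maximal: the same match is
            # already reported from an earlier start on this diagonal.
            if p > 0 and t > 0 and pattern[p - 1] == text[t - 1]:
                continue
            # Extend to the right for as long as the characters agree.
            length = 0
            while (p + length < len(pattern)
                   and t + length < len(text)
                   and pattern[p + length] == text[t + length]):
                length += 1
            if length >= min_len:
                pairings.append((p, t, length))
    return pairings


# Toy data (hypothetical): two 6-base "exons" separated by an 8-base "intron".
genome = "ACGTAC" + "GTTTTTAG" + "CGGATC"
transcript = "ACGTAC" + "CGGATC"

print(maximal_pairings(transcript, genome, min_len=5))
# → [(0, 0, 6), (6, 14, 6)]
```

In this toy case the two pairings cover the whole transcript, and the genomic gap between them (positions 6 to 13) is the candidate intron. A real spliced aligner also has to cope with sequencing errors, repeated regions, and ambiguous splice-site placement, all of which this sketch ignores.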