The investigation of genetic differences among humans has shown that mutations in DNA sequences are responsible for some genetic diseases. The most common type of mutation involves only a single nucleotide of the DNA sequence and is called a single nucleotide polymorphism (SNP). As a consequence, computing a complete map of all SNPs occurring in human populations is one of the primary goals of recent studies in human genomics.
Our research in this field is mainly focused on the design and experimentation of algorithms for solving combinatorial problems related to haplotype inference and the analysis of genetic variation.
Specific computational problems of interest are: (1) genotype imputation and haplotype reconstruction in pedigrees on real data (human and farm animals); and (2) haplotype phasing and genotype analysis under the coalescent model of perfect phylogeny, which describes the evolutionary history of SNP data, in the presence of recurrent mutations.
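As an illustration of the perfect phylogeny model mentioned above, the following minimal Python sketch applies the classical four-gamete test to a small binary haplotype matrix: without recurrent mutations, no pair of SNP columns can show all four combinations 00, 01, 10 and 11. The function names and the toy data are hypothetical and chosen only for this example; this is not the method used in our work, and recurrent mutations are exactly the case in which the test fails.

```python
from itertools import combinations


def shows_four_gametes(haplotypes, i, j):
    """Return True if SNP columns i and j contain all of 00, 01, 10, 11."""
    seen = {(h[i], h[j]) for h in haplotypes}
    return len(seen) == 4


def passes_four_gamete_test(haplotypes):
    """Return True if no pair of SNP columns shows all four gametes.

    Assumption: haplotypes are equal-length strings over '0' and '1'
    (0 = one allele, 1 = the other) with no missing entries.
    """
    if not haplotypes:
        return True
    n_snps = len(haplotypes[0])
    return not any(
        shows_four_gametes(haplotypes, i, j)
        for i, j in combinations(range(n_snps), 2)
    )


# Toy data (hypothetical): the first set is tree-like, the second is not.
tree_like = ["000", "100", "110", "001"]
conflicting = ["00", "01", "10", "11"]
print(passes_four_gamete_test(tree_like))
print(passes_four_gamete_test(conflicting))
```

A failed test on real data may indicate recurrent mutation, recombination or genotyping error, which is why models that tolerate such events are of interest.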
Haplotype assembly is the problem of reconstructing the two distinct copies of each chromosome, called haplotypes, from a collection of sequencing reads aligned to a reference genome.
Since current state-of-the-art approaches fail to fully exploit the novel characteristics of future-generation sequencing technologies, our goal is to model new combinatorial formulations of the problem and to design algorithms that overcome these limits, making it possible to phase larger datasets with higher accuracy and without restrictive assumptions.
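To make the haplotype assembly problem concrete, here is a minimal Python sketch of one common combinatorial formulation, minimum error correction (MEC). Choosing MEC is an assumption made for this example, since the text above does not name a specific objective. The sketch also assumes that every site is heterozygous, so the two haplotypes are complements of each other, and it enumerates all candidates exhaustively, which is feasible only for a handful of SNPs. The function names and the toy reads are hypothetical.

```python
from itertools import product


def mismatches(read, hap):
    """Count covered positions where the read disagrees with the haplotype.

    A '-' in the read means the position is not covered by that read.
    """
    return sum(1 for r, h in zip(read, hap) if r != "-" and r != h)


def mec_brute_force(reads):
    """Return (cost, (h1, h2)) minimizing the number of corrected read entries.

    Assumptions: reads are equal-length strings over '0', '1', '-';
    all sites are heterozygous, so h2 is the complement of h1.
    Exponential in the number of SNPs: an illustration, not a practical solver.
    """
    n_snps = len(reads[0])
    best_cost, best_pair = None, None
    for bits in product("01", repeat=n_snps):
        h1 = "".join(bits)
        h2 = "".join("1" if b == "0" else "0" for b in bits)
        # Each read is assigned to the haplotype it disagrees with least.
        cost = sum(min(mismatches(r, h1), mismatches(r, h2)) for r in reads)
        if best_cost is None or cost < best_cost:
            best_cost, best_pair = cost, (h1, h2)
    return best_cost, best_pair


# Toy fragment set (hypothetical): four consistent reads and one with an error.
reads = ["010-", "-101", "101-", "-010", "0111"]
cost, (h1, h2) = mec_brute_force(reads)
print(cost, h1, h2)
```

Practical tools replace the exhaustive search with dynamic programming or other exact and heuristic techniques, and may relax the all-heterozygous assumption.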
Here is the link to the page explaining our project and work.
Pirola, Y., Zaccaria, S., Dondi, R., Klau, G. W., Pisanti, N., & Bonizzoni, P. (2015). HapCol: accurate and memory-efficient haplotype assembly from long reads. Bioinformatics, 32(11), 1610-1617.