PhD position available

The University of Milano-Bicocca (https://en.unimib.it/) is a young, multidisciplinary university active in many fields: economics and statistics, law, science, medicine, sociology, psychology, and pedagogy. It is an innovative university that has built an extensive network of collaborations with many world-famous universities, research centers, and top corporations.

In the 2015 Times Higher Education ranking of the best hundred universities under 50 years old, the University was ranked 24th in the world and 1st in Italy.

The Department of Informatics, Systems and Communications (DISCo) is a leading Computer Science research and teaching unit in Italy. The Bioinformatics and Experimental Algorithmics (BIAS) research lab invites applications for a Doctoral Student in Data Structures and Algorithms for Graph Pangenomes, under the supervision of Professor Paola Bonizzoni.

The position is funded by the Innovative Training Network (ITN) project “Algorithms for PAngenome Computational Analysis (ALPACA)” of the Horizon 2020 Marie Skłodowska-Curie Actions (MSCA) Work Programme. The candidate will join the BIAS team of Professor Paola Bonizzoni, who is currently leading another EU-funded international project (PANGAIA) on Data Structures and Algorithms for Graph Pangenomes.

Representations for the comparative and hierarchical analysis of pangenomes

The representation of multiple genomes in a graph pangenome is a computational problem that has been addressed by indexing paths with compact data structures, such as the FM-index, the positional BWT, and the graph BWT. In this framework, some important questions are still unsolved and require the development of fast and efficient algorithmic approaches, including querying a graph-based data structure, sequence-to-graph and graph-to-graph comparison, and inferring variations (including structural variations) between genomes. A more general question is how to deal with multiple pangenomes, such as those emerging in the context of metagenomics (the study of multiple species in an environment) and transcriptomics (the study of gene expression and transcription). The main focus of this project is on developing representations of pangenomes that allow fast and space-efficient queries of multiple pangenomes, such as the search for a given substring in the pangenomes (i.e., pattern matching) and the search for approximate matches (i.e., sequence alignment or mapping).

We will investigate the problem under the assumption that the genomes encompassed in the pangenome are evolutionarily related, and that these relations are represented by a phylogenetic tree or network. We therefore need to exploit ancestral relationships.

We want to overcome the limitations of the usual BWT-based indexing of a single genome by extending the known approaches. Moreover, we plan to develop tools for comparing a set of reads, possibly a mixture of short and long reads, with a set of pangenomes as well as with other graph-based representations of gene structures. For this purpose, we will investigate graph-based representations in which several million colors encode the information of reads, together with their applications to pangenome comparison, in order to propose novel data structures for pangenomics.

Supervisor

Paola Bonizzoni (UNIMIB)

Co-supervisor

Gianluca Della Vedova (UNIMIB)

Host institution

University of Milano – Bicocca
Department of Computer Science, Systems, and Communication

PhD program

Computer Science (http://phd-computer-science.disco.unimib.it/)

PhD school (https://en.unimib.it/education/doctoral-research-phd-programmes)

Expected results

New data structures for representing pangenomes, and new algorithms for querying and comparing sets of pangenomes and for comparing a set of pangenomes with a set of reads.

Required profile

Strong background in Computer Science, Mathematics, or related fields; good command of English. Good knowledge of a low-level programming language (C, C++, Rust) and experience with bioinformatics and advanced data structures (Burrows-Wheeler Transform, de Bruijn graphs) are welcome.

Applicants must satisfy the requirements of an Early Stage Researcher as defined by the MSCA Work Programme: 1) on the starting date of your employment with the University of Milano – Bicocca, you are in the first four years of your research career and have not (yet) been awarded a doctoral degree; 2) you have not resided or carried out your main activity (study, work, etc.) in Italy for more than 12 months during the 3 years prior to the starting date of your anticipated employment with the University of Milano – Bicocca. Applicants are expected to acquire doctoral student status in the Doctoral Programme in Computer Science at the University of Milano – Bicocca during the standard 6-month probationary period.

Early Stage Researcher requirements and employment conditions in the MSCA Work Programme are detailed at https://ec.europa.eu/research/mariecurieactions/resources/document-libraries/information-note-fellows-innovative-training-networks-itn_en

Application deadline

February 15th, 2021

Starting date 

The starting date is in early September 2021, with the exact date negotiable. 

Salary and benefits 

The position is full-time and funded for three years. The salary is competitive and complies with the MSCA Work Programme: 3500 euros per month before taxes, consisting of Living and Mobility allowances after compulsory deductions. A conditional Family allowance of 385 euros can be added to the salary.

How to apply

Please submit your application by email directly to Professor Paola Bonizzoni (paola.bonizzoni@unimib), cc’ing Professor Gianluca Della Vedova (gianluca.dellavedova@unimib.it). The application must include the following attachments as a single PDF file (in English):

• CV, including any publications

• Cover letter describing motivation and research interests, with a declaration that you satisfy the MSCA Work Programme requirements for an Early Stage Researcher

• Contact details of two potential referees who have agreed to provide letters of recommendation

Applications will be given full consideration if received by February 15, 2021, 23:59 (CET). Applications received after this date will still be considered until the position is filled.