Bioinformatics and Genomics


Comparative Bioinformatics

Cédric Notredame
Miquel Orobitg, Darek Kedra
Jose Espinosa, Pablo Prieto, Maria Hatzou, Evan Floden
Ionas Erb, Cedrik Magis, Paolo Di Tommaso


The main focus of the group is the development of novel algorithms for the comparison of multiple biological sequences. Multiple comparisons have the advantage of precisely revealing evolutionary traces, thus allowing the identification of functional constraints imposed on the evolution of biological entities. Most comparisons are currently carried out on the basis of sequence similarity. Our goal is to extend this scope by allowing comparisons based on any relevant biological signal, such as sequence homology, structural similarity, genomic structure, functional similarity and, more generally, any signal that may be identified within biological sequences. Using such heterogeneous signals serves two complementary purposes: (i) producing better models that take advantage of the evolutionary resilience of these signals, and (ii) improving our understanding of the evolutionary processes that lead to the diversification of biological features. For this purpose we are developing methods for the comparison of protein sequences, protein structures, RNA sequences and structures, as well as complete genomes. We apply these methods to a wide range of biological questions, including Leishmania donovani resistance, animal and plant domestication, and gene annotation in human and other model systems. We are also applying similar algorithms to longitudinal data analysis in order to mine recordings for predictive patterns, with a special interest in the onset of obesity in murine models. All the tools we develop are open source freeware that can either be downloaded for personal use or accessed through dedicated web interfaces at

Research Projects

  • Development of multiple sequence aligners: T-Coffee
  • Homology Modelling of Non-Coding RNA
  • Large Scale Protein Sequence Alignments
  • Multiple Genome Alignments
  • Longitudinal data modelling
  • Structure-based multiple sequence comparison and classification
  • Protein and RNA structural evolution
  • HPC Computation tools: Nextflow

Selected Publications

Chang, J et al.
“TCS: a new multiple sequence alignment reliability measure to estimate alignment accuracy and improve phylogenetic tree reconstruction.”
Mol Biol Evol, 31(6):1625-1637 (2014).

Earl, D et al.
“Alignathon: a competitive assessment of whole-genome alignment methods.”
Genome Res, 24(12):2077-2089 (2014).

Pervouchine, DD et al.
“Enhanced transcriptome maps from multiple mouse tissues reveal evolutionary constraint in gene expression.”
Nat Commun, 6:5903 (2015).

Di Tommaso, P et al.
“SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments.”
Nucleic Acids Res, 42(Web Server issue):W356-60 (2014).

Yue, F et al.
“A comparative encyclopedia of DNA elements in the mouse genome.”
Nature, 515(7527):355-364 (2014).