Bioinformatics and Genomics


Gene Function and Evolution

Dr. Gian Gaetano Tartaglia (ICREA Research Professor from September 2014)
Dr. Benedetta Bolognesi, Dr. Teresa Botta-Orfila, Dr. Carmen Maria Livi
Dr. Silvia Rodriguez, Dr. Joana Ribeiro, Maria de las Nieves Lorenzo
Federico Agostini (received PhD in June 2014), Davide Cirillo, Domenica Marchese, Petr Klus, Riccardo delli Ponti (started PhD in May 2014)
Marta Baldrighi


Figure 1. Ribonucleoprotein networks (catRAPID omics express).

Our main interest is to understand the role played by RNA molecules in protein networks, with a special focus on disease-related pathways. Characterizing protein-RNA associations is key to unravelling the complexity and functionality of mammalian genomes and will open up new therapeutic avenues for the treatment of a broad range of neurodegenerative disorders. Our research is highly interdisciplinary, involves a number of international collaborators and focuses on the uncharted territory of non-coding RNAs involved in i) transcriptional and translational regulation (X-chromosome inactivation, differentiation processes) and ii) neurodegenerative diseases (Parkinson’s α-synuclein, Alzheimer’s disease amyloid protein APP, TDP-43 and FUS).

Figure 2. Large scale predictions of protein solubility (ccSOL).


Our group plays a leading role in projects aiming to manipulate the regulatory mechanisms that control protein production and prevent the formation of toxic aggregates. We are now consolidating these projects, which have led to a number of high-impact publications. They combine advanced computational methods developed in our group with state-of-the-art experimental techniques such as electrophoretic mobility shift assays, filter binding assays, RNA-binding protein immunopurification-microarrays and Thioflavin T fluorescence spectroscopy.

Research Projects

  • Ribonucleoprotein networks. Our integration of in silico and ex vivo data reveals two major types of protein-RNA interactions: positively correlated patterns related to cell cycle control, and negatively correlated patterns related to survival, growth and differentiation 1.
  • Regulatory regions in nucleic acid sequences. The large amount of data produced by high-throughput sequencing poses new computational challenges. Our SeAMotE approach 2 provides (i) a robust analysis of high-throughput sequence sets, (ii) a motif search based on pattern occurrences and (iii) an easy-to-use web-server interface.
  • Large scale predictions of protein solubility. Using coil/disorder, hydrophobicity, hydrophilicity, β-sheet and α-helix propensities, we built a predictor of protein solubility 3. Our method allows (i) proteome-wide predictions, (ii) identification of soluble fragments within each sequence and (iii) exhaustive single-point mutation analysis.
  • Classification of proteins based on sequence features. The cleverMachine algorithm analyses the physico-chemical properties of two datasets of protein sequences 4. The tool creates a signature for each protein using a large set of protein features, both experimentally derived and statistically derived from the outputs of other tools.
  • Structural determination of protein and nucleic-acid features. Almost is an open source computational package for structure determination and analysis of complex molecular systems, including proteins and nucleic acids 5. Almost has been designed with two primary goals: to provide tools for molecular structure determination using various types of experimental measurements as conformational restraints, and to provide methods for the analysis and assessment of structural and dynamical properties of complex molecular systems.
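
As a rough illustration of what a motif search based on pattern occurrences can look like, the sketch below ranks short patterns (k-mers) by how much more often they occur in one sequence set than in another. It is a minimal toy under stated assumptions and is not the SeAMotE algorithm: the function names kmer_coverage and discriminative_kmers, the scoring by coverage difference and the example sequences are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def kmer_coverage(sequences, k):
    """Return, for each k-mer, the fraction of sequences containing it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each sequence contributes at most once per k-mer.
        counts.update({seq[i:i + k] for i in range(len(seq) - k + 1)})
    n = len(sequences)
    return {kmer: c / n for kmer, c in counts.items()}


def discriminative_kmers(positive, negative, k=4, top=5):
    """Rank k-mers by coverage in the positive set minus coverage in the negative set."""
    pos = kmer_coverage(positive, k)
    neg = kmer_coverage(negative, k)
    scores = {kmer: cov - neg.get(kmer, 0.0) for kmer, cov in pos.items()}
    # Highest score first; ties are broken alphabetically for reproducibility.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:top]


# Made-up RNA sequences (assumption): the positive set shares the pattern GCUGCA.
positive = ["AUGGCUGCAAA", "CCGCUGCAUU", "UUGCUGCAGG"]
negative = ["AUGGAUCCAAA", "CCGAUUAAUU", "UUGGGACAGG"]

ranked = discriminative_kmers(positive, negative, k=5, top=3)
for kmer, score in ranked:
    print(kmer, round(score, 2))
```

Counting each pattern once per sequence, rather than counting every occurrence, keeps a single repeat-rich sequence from dominating the ranking; a real analysis of high-throughput sets would also need a statistical test for the enrichment.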

Selected Publications

(cited in the order of Research Projects)

Cirillo D et al.
“Constitutive patterns of gene expression regulated by RNA-binding proteins.”
Genome Biol, 15:R13 (2014).

Agostini F, Cirillo D, Ponti RD and Tartaglia GG.
“SeAMotE: a method for high-throughput motif discovery in nucleic acid sequences.”
BMC Genomics, 15:925 (2014).

Agostini F, Cirillo D, Livi CM, Ponti RD and Tartaglia GG.
“ccSOL omics: a webserver for large-scale prediction of endogenous and heterologous solubility in E. coli.”
Bioinformatics, btu420. doi:10.1093/bioinformatics/btu420 (2014).

Klus P et al.
“The cleverSuite approach for protein characterization: predictions of structural properties, solubility, chaperone requirements and RNA-binding abilities.”
Bioinformatics, 30:1601–1608 (2014).

Fu B et al.
“ALMOST: An all atom molecular simulation toolkit for protein structure determination.”
J Comput Chem, 35:1101–1105 (2014).