Comparative Genomics

Our research focuses on the use of comparative genomics and phylogenomics to study the origin, evolution and function of complex biological systems. This includes understanding how specific biochemical pathways, protein complexes or cellular organelles emerged and evolved, as well as using this evolutionary information to gain insight into their function.

Through collaborations with experimental groups we apply comparative genomics to discover new mechanisms and genes involved in interesting processes, especially those of clinical relevance (see lines of research). On the technical side, our work often involves the development of new bioinformatics tools and algorithms that we make available to the community.

Phylogenomics and genome evolution

In the genomic era it has become possible to move from the evolutionary analysis of single protein families (phylogenetics) to that of complete genomes and proteomes (phylogenomics). To achieve this transition, new tools have been developed that allow the automated, large-scale reconstruction of thousands of phylogenetic trees. Interpreting this kind of complex data poses many difficulties and requires the development of novel algorithms, tools, ways of representing the data, and even new semantics and concepts. We combine the development of original algorithms for treating phylogenomic data with their application to problems of biological relevance. We have recently started an ERC-funded project entitled RETVOLUTION: “processes and patterns of reticulate evolution in eukaryotes”, which will combine innovative computational and experimental approaches to infer patterns of reticulate evolution across the eukaryotic tree and relate them to current biological knowledge. A particular focus is placed on elucidating the role of hybridization in the origin of whole genome duplications and in facilitating the spread of horizontally transferred genes.

Comparative genomics and population genomics of fungal pathogens

Fungal infections constitute a significant and ever-growing medical problem. Diseases caused by such pathogens range from simple toenail infections to life-threatening systemic mycoses in patients with impaired immune systems. The molecular mechanisms driving the invasion of mammalian hosts by fungal pathogens pose many scientifically challenging problems, which are as yet poorly understood. The ability to infect humans has emerged in several lineages throughout the fungal tree of life. The mechanisms of fungal pathogenesis can therefore be approached from an evolutionary perspective, by detecting specific adaptations in pathogenic lineages. In recent years we have clarified the evolutionary paths to virulence of major fungal pathogens such as Candida glabrata and Candida parapsilosis.

Microbiome-host interactions in health and disease

As a natural progression of our interest in the role of fungi in health and disease, the group started working on the analysis of the microbiome. Our main focus within this field is quantifying and understanding the role of the fungal component, which is usually neglected by mainstream analyses. In the last three years we have engaged in several large projects, which include the oral microbiome project “Saca La Lengua” (with over 1500 samples), the lung microbiome in patients with chronic respiratory syndrome, the lung microbiome in intensive care unit patients on artificial ventilators, and the intestinal microbiome in colon cancer. These projects involve the analysis of both bacterial metagenomes and the fungal component. For the analysis of the fungal microbiome we have developed novel approaches that overcome existing limitations of traditional ITS-based sequencing.

Evolutionary genomics of long, non-coding RNAs

Recent genomic analyses have facilitated the discovery of a novel major class of stable transcripts, now called long non-coding RNAs (lncRNAs). A growing number of analyses have implicated lncRNAs in the regulation of gene expression, dosage compensation and imprinting, and there is increasing evidence suggesting the involvement of lncRNAs in various diseases such as cancer. Despite recent advances, however, the role of the large majority of lncRNAs remains unknown, and there is ongoing debate about what fraction of lncRNAs may simply represent transcriptional noise. Our group has recently embarked on a project, funded by the European Research Council, that aims to combine state-of-the-art computational and sequencing techniques in order to elucidate what evolutionary mechanisms are shaping this enigmatic component of eukaryotic genomes.

Evolution of eukaryotes

Every eukaryotic organism shows a high level of sub-cellular compartmentalization that is significantly more intricate than that of even the most complex prokaryotic cell. How such a degree of complexity came to be is still not fully understood. In this context, endosymbiotic events with bacterial organisms have been proposed to be the source of a number of organelles, including mitochondria, chloroplasts and peroxisomes. Only recently has it become possible to test these hypotheses against the growing availability of completely sequenced genomes and organellar proteomic data. In recent years, we have used large-scale evolutionary analyses to investigate the origin and evolution of the two most widespread organelles for which an endosymbiotic origin has been proposed: mitochondria and peroxisomes.

Late acquisition of mitochondria by a host with chimeric prokaryotic ancestry. Nature, 531, 101-104, doi: 10.1038/nature16941 (2016)
Pegueroles C, Gabaldón T.
Secondary structure impacts patterns of selection in human lncRNAs. BMC Biology, 14, 60, doi: 10.1186/s12915-016-0283-0 (2016)
Marcet-Houben M, Gabaldón T.
Beyond the whole genome duplication: phylogenetic evidence for an ancient inter-species hybridization in the baker’s yeast lineage. PLOS Biology, 13 (8), e1002220, doi: 10.1371/journal.pbio.1002220 (2015)
Green RE, (48 other authors), Gabaldón T, Paten B, Ray DA.
Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs. Science, 346 (6215), doi: 10.1126/science.1254449 (2014)
Supek F, Miñana B, Valcarcel J, Gabaldón T, Lehner B.
Synonymous mutations frequently act as driver mutations in human cancer. Cell, 156 (6), 1324-1335, doi: 10.1016/j.cell.2014.01.051 (2014)
Gabaldón T.
A metabolic scenario for the evolutionary origin of peroxisomes from the endomembranous system. Cell Mol Life Sci., 71 (13), 2373-2376, doi: 10.1007/s00018-013-1424-z (2014)
Pryszcz LP, Németh T, Gácser A, Gabaldón T.
Genome comparison of Candida orthopsilosis clinical strains reveals the existence of hybrids between two distinct subspecies. Genome Biol Evol., 6 (5), 1069-78, doi: 10.1093/gbe/evu082 (2014)
Huerta-Cepas J, Capella-Gutiérrez S, Pryszcz LP, Marcet-Houben M, Gabaldón T.
PhylomeDB v4: zooming into the plurality of evolutionary histories of a genome. Nucleic Acids Res., 42 (1), D897-902, doi: 10.1093/nar/gkt1177 (2013)
Gabaldón T, Koonin EV.
Functional and evolutionary implications of gene orthology. Nat Rev Genet., 14 (5), 360-6, doi: 10.1038/nrg3456 (2013)
Marcet-Houben M, Gabaldón T.
Acquisition of prokaryotic genes by fungal genomes. Trends in Genetics, 26 (1), 5-8, doi: 10.1016/j.tig.2009.11.007 (2010)

This group receives financial support from the following sources:

  • Ministerio de Ciencia, Innovación y Universidades
  • Fondo Europeo de Desarrollo Regional (FEDER)
  • AGAUR - Generalitat de Catalunya
  • H2020 research framework