Of the 614 cell lines studied, 7% closely corresponded to a different type of cancer than that thought to give rise to the cell line.
The Genome Data Science Laboratory at IRB Barcelona has drawn up a reference list of 366 cell lines in which the genetic pattern corresponds to the type of tumour that they intend to model.
The work shows that studies using the cell lines on this reference list have a greater discovery rate, and has been published in the journal Science Advances.
In cancer research, cell lines are an essential tool to study tumour biology and to test drugs or therapeutic approaches. Led by the ICREA researcher Fran Supek, the Genome Data Science Laboratory at IRB Barcelona has evaluated the adequacy of 614 cell lines and has concluded that only approx. 60% greatly conserve the characteristics of the tumour of origin, while another 20% maintain some resemblance. The remaining 18% differ considerably from the tumour that they supposedly represent. Therefore, they provide a less reliable model of the tumour type they are meant to represent. Surprisingly, 7% of all cancer cell lines used in labs worldwide closely correspond to a completely different type of cancer than that they are originally believed to be.
Different types of cancer, such as colon or lung cancer, are distinct diseases that evolve and are treated in a different manner. Cell lines for cancer research are typically generated from a surgical tissue sample from a patient's tumour. They are grown in the laboratory and serve as a study model for that disease. For research projects that target a particular type of cancer, it is crucial that the cells used in the laboratory truly represent the type of cancer they are supposed to represent.
“Identifying cell lines that are very different from the tumour of origin is crucial because they can lead to false positive or false negative results in experiments,” says Supek. "Promising new candidates for cancer drugs may remain undiscovered because the tissue-of-origin of cell lines was not correctly identified," he explains.
Large-scale data analysis with immediate application
This is the first large-scale data analysis that jointly considers genetic, gene expression and epigenetic patterns of more than 600 cancer cell lines, together with 9,000 samples from 22 types of human tumours. Using machine-learning methods, the Genome Data Science Laboratory has compiled a list of 366 cell lines that reliably correspond to the tumour of origin. This list is available as a resource to the scientific community as a part of the article, published in the journal Science Advances.
The researchers have also analysed the results of previous experiments by other labs that have used many cell lines in parallel. They have shown that, by eliminating misclassified cell lines or using only cell lines from the reference list, we are better placed to discover new drugs that target specific markers. Their data analyses point to several new molecules that may show activity against pancreatic or brain cancer cells that bear particular mutations.
“We hope that prioritizing the cell lines that we have selected will help researchers obtain more reliable experimental results when searching for new cancer treatments,” says Supek.
Ovarian cancer or skin cancer?
One of the cell lines studied, ES-2, derives from sample of an ovarian tumour surgically removed from a 47-year-old woman. However, the scientists observed that the pattern of activated and silenced genes shown by this cell corresponds to that of skin cancer. "Then we studied the specific mutations in this cell line, and we saw that they have a mutational signature that is caused by UV rays. We think that this ovarian tumour may have in fact originated from a metastasis of a skin cancer and that is why it has these characteristics,” says Marina Salvadores, the lead author of this study. “In total, we identified five different cell lines that may originate from skin cancer rather than the various internal organs that they were thought to come from. This seems consistent with the high propensity of melanoma to metastasize to various organs.”
Marina Salvadores is a PhD student with the Genome Data Science Laboratory of the IRB Barcelona and she holds a FPU fellowship. Fran Supek is an ICREA research professor, head of the Genome Data Science Laboratory and an EMBO Young Investigator. The work in the laboratory is funded by the ERC Starting Grant “HYPER-INSIGHT” and the Spanish Ministerio de Ciencia e Innovación grant “RegioMut”.
Marina Salvadores, Francisco Fuster-Tormo & Fran Supek
Matching cell lines with cancer type and subtype of origin via mutational, epigenomic, and transcriptomic patterns
Science Advances (2020) DOI:10.1126/sciadv.aba1862
About IRB Barcelona
Created in 2005 by the Generalitat de Catalunya (Government of Catalonia) and University of Barcelona, IRB Barcelona is a Severo Ochoa Centre of Excellence, a seal that was awarded in 2011. The institute is devoted to conducting research of excellence in biomedicine and to transferring results to clinical practice, thus improving people’s quality of life, while simultaneously promoting the training of outstanding researchers, technology transfer, and public communication of science. Its 27 laboratories and eight core facilities address basic questions in biology and are orientated to diseases such as cancer, metastasis, Alzheimer’s, diabetes, and rare conditions. IRB Barcelona is an international centre that hosts 400 employees and more than 30 nationalities. It is located in the Barcelona Science Park. IRB Barcelona is a CERCA center, and a member of the Barcelona Institute of Science and Technology (BIST).