Medialab-Prado, the cultural centre of the Arts, Sport and Tourism Office of the Madrid City Council, has invited the Spanish National Cancer Research Centre (CNIO) to join its Ojo al Data project. Since March, this project has been dedicated to studying the big data phenomenon, i.e. the exponential increase in the volume of information generated by scientific, economic, social and cultural processes. On July 2nd, the CNIO responded to the invitation by opening the doors of its Bioinformatics Unit and its Medicinal Chemistry Section, where visitors were able to witness the potential of this phenomenon for cancer research.
The big data concept refers to the management and interpretation of massive amounts of data that, due to their volume, variability and the speed at which they are generated, cannot be processed using traditional tools. The public often associates it with large technology or telecommunications companies but, although this is little known, it is also very important for science. Nowadays, in a centre dedicated to biological, chemical or medical research, it is very difficult to conduct experiments or understand them without advanced computational tools. In oncology in particular, this has had profound implications: it has made DNA sequencing faster and cheaper, revolutionising the study of the genetic bases of cancer.
“It is important for people to know that a research centre like the CNIO conducts large-scale experiments that generate data at the same rate as large companies, and for them to realise that information technologies are essential if we want to understand the genetic and molecular bases of cancer,” explains David González Pisano, head of the Bioinformatics Unit. Made up of biologists, bioinformaticians and engineers, the Unit supports the CNIO research groups that need to work with massive data sets during their experiments in order to answer each biological question that arises.
‘BIG DATA’ TO UNDERSTAND DISEASES
“Human beings are estimated to have about 37 trillion cells and most of them contain two copies (paternal and maternal) of a DNA molecule,” says González Pisano. “In turn, this molecule contains 3 billion chemical letters that form about 20,000 genes. Each cell produces approximately 250,000 proteins each second. These numbers are difficult to assimilate and, of course, to handle. Current biotechnology techniques enable us to capture molecular pictures of part of those biological systems and observe, in a single experiment, the status and composition of millions of these elements. This generates massive amounts of data that enable us to understand the diseases we are studying, but to make that possible we have to manage and analyse these data using computational technologies.”
Participants also visited the Medicinal Chemistry Section of the Experimental Therapeutics Programme, which is dedicated to the early stages of drug discovery. “Based on the molecules that have already proven effective for a type of cancer in animal models, we obtain potential anticancer drugs,” explains Sonia Martinez, head of the Section. “During the compound optimisation process, which includes many in vitro and in vivo tests, we generate a large amount of data for each chemical compound.” The Section processes these data automatically through a platform that manages information on a collection of 50,000 compounds, 28,000 chemical reactions and 700,000 biological results, and that provides access to suppliers of more than 5 million chemical compounds.
The visit is part of the Ojo al Data project, under which a number of conferences, workshops and discussions have been held, including the BBVA InnovaChallenge Data Week at the BBVA Innovation Centre, as well as the ‘Big Bang Data’ exhibition and the ‘Vivir en un mar de datos’ (‘Living in a sea of data’) conference, both organised by Fundación Telefónica. The organiser of Ojo al Data, Medialab-Prado, is conceived as a citizen laboratory for the production, research and dissemination of cultural projects that explore the collaborative forms of experimentation and learning that have emerged from digital networks.