Biomedical cancer research is a particularly data-heavy discipline in which key information sources are not limited to genomic information or raw experimental data. Unstructured data in particular, such as the scientific literature, clinical texts, medicinal chemistry patents or patient-generated content, constitute a valuable resource for a range of scenarios, including drug discovery, interpretation of large-scale experimental results, drug repurposing and evidence-based medicine. Medical big data approaches can exploit running text efficiently only through natural language processing (NLP) techniques that rely on deep learning and artificial intelligence strategies. Our Unit is financed through the Plan for the Advancement of Language Technologies. Its aim is to generate resources that improve the exploitation of biomedical data by implementing and evaluating systems for the automatic recognition of medical concepts, generating specialised neural machine translation models for the medical domain, and implementing a medical language technology platform and software components for processing Spanish electronic health records (EHRs).
- (2017). Information Retrieval and Text Mining Technologies for Chemistry. Chem Rev 117, 7673-7761.
- (2017). A molecular hypothesis to explain direct and inverse co-morbidities between Alzheimer’s Disease, Glioblastoma and Lung cancer. Sci Rep 7, 4474.
- (2017). LimTox: a web tool for applied text mining of adverse event and toxicity associations of compounds, drugs and genes. Nucleic Acids Res (in press).