The OntoGene group has recently entered a strategic partnership with the Data Science Group (Pharma Research and Early Development Informatics) at Hoffmann-La Roche, with the aim of pursuing joint research on identifying mentions of protein-protein interactions in the biomedical literature.

The OntoGene group at the University of Zurich (UZH) specializes in mining the scientific literature for evidence of interactions among entities of relevance for biomedical research (genes, proteins, drugs, diseases, chemicals).
The Data Science group at Hoffmann-La Roche supports projects in research and early development through the analysis, management and visualization of biological and chemical data, drawing on its expertise in chemoinformatics, text mining, data mining, information science, competitor information, pathway analysis and bioinformatics.

The aim of the collaboration is to develop a system capable of automatically processing an input corpus of scientific articles and detecting evidence for specific protein interactions described in those documents. Given an input gene or protein, the system will locate all interactions of that gene/protein and present them as a ranked list, with evidence drawn from every paper in which each interaction is mentioned. The interface will be structured to allow easy inspection of the original evidence from the publications for any candidate interaction suggested by the system. The ranking computed by the system should take into consideration not only the local evidence in each paper, but also the global evidence across the collection.

In summary, the system should be capable of:
1. identifying all interactions in which a given protein is involved
2. ranking them based on evidence in the literature
3. enabling curation by an end user through a user-friendly interface
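
To illustrate how local evidence from individual papers might be combined into a global ranking, here is a minimal sketch in Python. It is not the project's actual method: the `Mention` record, the per-paper `local_score` in [0, 1], and the noisy-OR combination are all assumptions chosen for illustration, and the protein and paper identifiers in the usage note are made-up placeholders.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Mention:
    """One candidate interaction found in one paper (hypothetical record)."""
    paper_id: str
    protein_a: str
    protein_b: str
    local_score: float  # assumed per-paper confidence in [0, 1]
    sentence: str       # original text, kept so a curator can inspect it


def rank_interactions(query, mentions):
    """Return (partner, global_score, evidence) tuples, best first."""
    # Step 1: collect every mention that involves the query protein.
    grouped = defaultdict(list)
    for m in mentions:
        if m.protein_a == query:
            grouped[m.protein_b].append(m)
        elif m.protein_b == query:
            grouped[m.protein_a].append(m)

    # Step 2: combine local scores into a global score per partner.
    # Noisy-OR (an assumption): the interaction is missed only if
    # every individual paper's evidence is wrong.
    ranked = []
    for partner, evidence in grouped.items():
        miss = 1.0
        for m in evidence:
            miss *= 1.0 - m.local_score
        ranked.append((partner, 1.0 - miss, evidence))

    # Step 3: sort by global score, keeping the evidence for curation.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    demo = [
        Mention("paper1", "P1", "P2", 0.5, "P1 binds P2."),
        Mention("paper2", "P2", "P1", 0.5, "P2 interacts with P1."),
        Mention("paper3", "P1", "P3", 0.7, "P1 phosphorylates P3."),
    ]
    for partner, score, evidence in rank_interactions("P1", demo):
        print(partner, round(score, 2), [m.paper_id for m in evidence])
```

With these placeholder inputs, two moderately confident papers supporting the same interaction outrank a single more confident paper, which is one way that evidence across the collection can influence the ranking. Each result keeps its supporting sentences so that an end user can check the original evidence.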

The planned system addresses a recognized need in computational biology and other applications. The project can build on a significant amount of previous research in biomedical text mining performed by the OntoGene group. The quality of their text mining technologies has been demonstrated through participation in several community-organized evaluation campaigns, where they often obtained top-ranked results (e.g. best results in extracting protein interactions, best results in identifying several categories of biomedical entities). Their research is supported by the Swiss National Science Foundation and their results are documented in numerous publications in prestigious journals. They have collaborated with several well-known databases, including PharmGKB (Stanford University) and the Comparative Toxicogenomics Database (Mount Desert Island Biological Lab, Maine).