Text Mining, Information Extraction,     
and Assisted Curation in the biomedical domain


2021  IMPORTANT

This site will no longer be updated!

For up-to-date information about our activities check our new web site:

 https://nlp.idsia.ch/


NEWS (2019/2020)


BRIEF DESCRIPTION


OntoGene/BioMeXT is a research initiative that aims to push the boundaries of text mining for the biomedical literature. Our core competencies lie in Information Extraction, i.e. the extraction of domain-specific entities (such as genes, proteins, drugs, diseases) and their semantic relations from the biomedical scientific literature. Our approach is based on high-recall entity recognition, followed by relation extraction that combines shallow and deep (dependency parsing) methodologies with advanced machine learning techniques.
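As a rough illustration of the two-stage idea described above (high-recall entity recognition followed by relation extraction), the following Python sketch matches terms from a small dictionary and then proposes every pair of co-occurring entities in a sentence as a candidate relation. The term list, the entity types, and the function names are assumptions made for this example only. It is not the OntoGene implementation, which according to the description relies on dependency parsing and machine learning rather than plain co-occurrence.

```python
import re
from itertools import combinations

# Hypothetical toy dictionary (term -> entity type); a real system would
# load large terminologies instead.
TERMS = {
    "brca1": "gene",
    "tp53": "gene",
    "tamoxifen": "drug",
    "breast cancer": "disease",
}


def find_entities(sentence):
    """Return (start, end, text, type) for every dictionary term found.

    Matching is case-insensitive and keeps all hits, favouring recall
    over precision.
    """
    hits = []
    for term, etype in TERMS.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        for m in re.finditer(pattern, sentence, flags=re.IGNORECASE):
            hits.append((m.start(), m.end(), m.group(0), etype))
    return sorted(hits)


def candidate_relations(sentence):
    """Pair up all entities that co-occur in the sentence.

    These are only candidates; a later step (not shown) would score or
    filter them.
    """
    entities = find_entities(sentence)
    return [(a[2], b[2]) for a, b in combinations(entities, 2)]


if __name__ == "__main__":
    text = "Tamoxifen is used to treat breast cancer in BRCA1 carriers."
    print(find_entities(text))
    print(candidate_relations(text))
```

Generating all co-occurring pairs keeps recall high at this stage; precision would have to come from the later filtering step, which is deliberately left out of the sketch.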

We consider community-run evaluation challenges to be the best way to provide an independent and unbiased evaluation of text mining tools. We have participated in several BioCreative challenges (since 2006), the BioNLP shared task (2009) and CALBC (2010). See below for details of our results.

Additionally, we provide an environment for Assisted Curation (ODIN) as an example of a real-world application of biomedical text mining. ODIN was initially developed as part of the SNF-funded project SASEBio, with additional support from a pharmaceutical company. It was tested in the non-competitive interactive track of the BioCreative competitions, where it received favourable comments from users. It was also tested in collaboration with the PharmGKB database at Stanford University [REF]. ODIN is currently being used in the curation pipeline of the RegulonDB database, in a project funded by the US National Institutes of Health.

Our selected publications provide a good overview of the techniques used in the OntoGene system. We are based at the Institute of Computational Linguistics (Department of Computer Science) of the University of Zurich. Since 2016 our group has been a member of the Swiss Institute of Bioinformatics. Our new group name is BioMeXT.

PAST HIGHLIGHTS

  • 2019
  • 2018
    • Funding:
      • The project MedMon (Monitoring of Internet resources for pharmaceutical research and development) was approved and has started. The project is a collaboration with HTW Chur and Roche on mining social media for drug-relevant information. The funding agency is InnoSuisse.
      • A new collaborative project with the VetSuisse faculty (Bern) has started. We are glad to be able to continue on the successful path established by our previous project.
      • We submitted an OpenMinTED tender application, which was accepted! As a result of our participation, BTH and OGER are available within the OpenMinTED infrastructure. OpenMinTED aims to create an open infrastructure that facilitates the use of text and data mining technologies on scientific publications, building on existing text mining tools and making them interoperable through appropriate registries and a standards-based interoperability layer.
    • Conferences: Dr. Fabio Rinaldi is a co-organizer of the 4th Biomedical Linked Annotation Hackathon at DBCLS, Kashiwa, Japan.
    • Collaborations: Dr. Fabio Rinaldi visited the RegulonDB group within the scope of an NIH-funded collaboration.
    • Other: Lenz Furrer visited the NLP group at the Department of Information Technology, University of Turku, thanks to a 6-month SNF Doc Mobility grant.
  • 2017
    • Funding:
      • The project "Automated detection of adverse drug events from older inpatients' electronic medical records using structured data mining and natural language processing", submitted within the "Smarter Health Care" National Research Programme (NRP74), has been approved. OntoGene/BioMeXT will participate in this project, contributing natural language processing technologies for the automated analysis of medical records.
    • Publications:
    • Other: We participated in the TIPS shared task of the BioCreative/BECALM challenge, which evaluated web services for biomedical text mining. Our system had the best evaluation score for efficiency of annotation. Once again, OntoGene/BioMeXT confirms that it can deliver state-of-the-art and best-of-breed results.
  • 2016
    • Funding: Official start of the MelanoBase project on March 1st. Collaboration with Roche on text mining for competitive intelligence. COST action GREEKC [CA15205].
    • Conferences: Dr. Rinaldi scientific chair of SMBM 2016.
    • Other: Dr. Fabio Rinaldi nominated as Swiss representative in the Management Committee of the COST action GREEKC [CA15205].
    • Publications: paper about the BEL Track at BioCreative 2016.
    • Collaborations: part of the MelanoBase project will be carried out at the HTL-NLP group of the Fondazione Bruno Kessler (FBK), Trento, Italy.
  • 2015
  • 2014
    • Funding: NIH project in collaboration with the RegulonDB group (UNAM, Mexico) approved! Pilot project on large-scale detection of protein-protein interaction from the literature funded by Roche.
    • Conferences: co-organizers of SMBM 2014.
    • Collaborations: Joint paper on BioC implementations, in collaboration with the Wilbur group at the National Center for Biotechnology Information / National Library of Medicine (NCBI/NLM).
  • 2013
    • Evaluation: OntoGene's participation in BioCreative 2013 was a resounding success. More information and papers here.
    • Conference: Dr. Fabio Rinaldi co-chair of LBM 2013, December 12-13, University of Tokyo, Japan.
  • 2012
    • Conferences: we organized the 5th International Symposium on Semantic Mining in Biomedicine (SMBM)
    • Evaluation: Best overall results in the 'triage' task of BioCreative 2012 (in particular due to very accurate entity recognition), as described in this paper
  • 2011
  • 2010
    • Evaluation: Good results in all of the tasks of the BioCreative III competition; best results in the 'species' and 'diseases' categories of the CALBC competition.
    • Funding: SASEBio project approved (SNF).
  • 2009
    • Evaluation: Our system had the best results for the detection of protein-protein interactions in the BioCreative II.5 competitive evaluation (2009) [REF].
  • 2006
    • Evaluation: In BioCreative II, best results in the detection of mentions of experimental methods and third-best results in the detection of protein-protein interactions from the literature [REF].

    For more details, see past highlights.

    FUNDING

    • From 2016 till 2018 our main source of funding will be the project MelanoBase, recently approved by the Swiss National Science Foundation. Additional funding will be provided by the PsyMine project, and by industrial collaborations.
    • From Aug 2010 to Jul 2014 we were funded by the Swiss National Science Foundation (SNF) through the SASEBio project (Semi-Automated Semantic Enrichment of the Biomedical Literature). Additional funding was provided by Novartis and Roche.
    • Additionally in 2012-2013 a post-doc in our group (Gintarė Grigonytė) was funded by a Sciex fellowship (see BioTermEvo project).
    • In 2008/2009 we were funded by the SNF project Detection of Biological Interactions from Biomedical Literature (grant 100014-118396/1).


    Visual Summary of Current and Recent Projects

    • MelanoBase (SNF)
    • Author Name Disambiguation
    • VetSuisse