As a result of our participation in the BioCreative IV challenge (task 3, 2013), we provide web services that identify gene, chemical, disease, and action term mentions in a PubMed abstract provided as input. The services return the span of each identified term (as character offsets) together with its unique identifier. The input and output of the services follow the BioC standard. The recognized terminology and the identifiers are based on CTD's controlled vocabulary structure. See this page for details of usage.

The OntoGene system has repeatedly achieved high-quality results in the detection of biomedical entities in text. For example, in 2010 our system achieved the best results in the categories "genes" and "diseases" in the CALBC competition. We also achieved the best results in the CTD task of BioCreative 2012, as the figure below shows (OntoGene is group number 116).

[Note, April 2016] We have created new web services that will considerably expand the capabilities of the system described above. These services accept a plain text document or a PubMed ID (via a RESTful interface) and return the biomedical entities detected in the input text in BioC format, TSV, or plain XML. The entities detected include genes, diseases, chemicals, cell lines, and species. The new services will be made publicly available as soon as we have completed some testing.

Fabio Rinaldi, Simon Clematide, Hernani Marques, Tilia Ellendorff, Martin Romacker, and Raul Rodriguez-Esteban. OntoGene web services for biomedical text mining. BMC Bioinformatics 2014, 15(Suppl 14):S6. doi:10.1186/1471-2105-15-S14-S6
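
As an illustration of the BioC output described above (term spans as character offsets plus identifiers), the following is a minimal sketch of how a client might read annotations from a BioC XML response. The sample document, the infon keys ("type", "identifier"), the identifier values, and the function name extract_mentions are assumptions made for this example; they are not taken from the actual service output.

```python
import xml.etree.ElementTree as ET

# Hypothetical BioC fragment. The infon keys and identifier values are
# illustrative assumptions, not the real output of the OntoGene services.
SAMPLE = """<collection>
  <source>PubMed</source>
  <document>
    <id>0000000</id>
    <passage>
      <offset>0</offset>
      <text>Aspirin reduces fever.</text>
      <annotation id="T1">
        <infon key="type">chemical</infon>
        <infon key="identifier">EXAMPLE:0001</infon>
        <location offset="0" length="7"/>
        <text>Aspirin</text>
      </annotation>
      <annotation id="T2">
        <infon key="type">disease</infon>
        <infon key="identifier">EXAMPLE:0002</infon>
        <location offset="16" length="5"/>
        <text>fever</text>
      </annotation>
    </passage>
  </document>
</collection>"""


def extract_mentions(bioc_xml):
    """Return one dict per annotation found in a BioC XML string."""
    root = ET.fromstring(bioc_xml)
    mentions = []
    for passage in root.iter("passage"):
        # BioC offsets are relative to the document, so the passage offset
        # is needed to index into the passage text.
        passage_offset = int(passage.findtext("offset", default="0"))
        passage_text = passage.findtext("text", default="")
        for ann in passage.iter("annotation"):
            infons = {i.get("key"): i.text for i in ann.findall("infon")}
            loc = ann.find("location")
            if loc is None:
                continue  # skip annotations without a span
            start = int(loc.get("offset"))
            length = int(loc.get("length"))
            local = start - passage_offset
            mentions.append({
                "type": infons.get("type"),
                "identifier": infons.get("identifier"),
                "start": start,
                "end": start + length,
                "text": ann.findtext("text"),
                # Text recovered from the offsets, for cross-checking.
                "surface": passage_text[local:local + length],
            })
    return mentions


if __name__ == "__main__":
    for m in extract_mentions(SAMPLE):
        print(m["type"], m["identifier"], m["start"], m["end"], m["text"])
```

Comparing the "surface" value recovered from the offsets with the annotation's own text is a simple way to check that offsets are being interpreted consistently.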