ICBO 2015: my experiences

The International Conference on Biomedical Ontology 2015 (ICBO 2015) was held in Lisbon from the 27th through the 30th of July. ICBO 2015 brought together people from several communities involved in the use and development of ontologies in biomedical research, health care, and related areas.

me and my poster

The best of Lisbon

It was my first time in Lisbon and it was an amazing experience. Lisbon has beautiful places, a nice summer climate, good restaurants and cafés, and the unforgettable “pastéis de Belém”.

My best moments at the ICBO

The conference included two days of tutorials and workshops, followed by two days of scientific sessions and presentations. The workshops and tutorials, just like the poster and demo sessions, were short, intensive and full of discussion. I participated with a poster and a flash talk; it was a great opportunity to catch the attention of everybody at ICBO and get the message about my work across… and it did work!!! I won the best poster award and got good feedback. Below, I summarize the main activities of the conference.

Belém Tower

The workshops and tutorials

Day 1

I attended the third edition of the IWOOD workshop, a half-day event about definitions of terms in ontologies. During this workshop, a common problem faced by both designers and users of ontologies was presented: the lack of explicitness in the definitions of ontology terms. Definitions are often incorrect, inconsistent, insufficient and/or missing. Daniel R. Schlegel and Peter L. Elkin propose in their paper the development of an ontology visualization tool based on the CSNePS GUI, together with an algorithm that orders the undefined ontology terms so that ontology developers can be most efficient in providing definitions. According to the authors, these tools make the number of undefined terms easy to see and push developers toward improving the quality of their ontology (a quick sketch of this kind of check is shown below). In the second part of IWOOD, a method proposed by Bikash Gyawali, Claire Gardent and Christophe Cerisara was presented for automatically generating descriptions of biological events encoded in the KB BIO 101 knowledge base. Their probabilistic method differs from previous work in that (i) it is unsupervised and (ii) it focuses on n-ary relations and on how to automatically map natural language and KB arguments (the paper is available here).
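The definition-coverage check that such tools automate can be approximated in a few lines of code. The sketch below is not the CSNePS-based tool from the paper; it is a minimal, hypothetical example that uses rdflib and the IAO “definition” annotation property (IAO_0000115), which many OBO ontologies use for textual definitions, to list the named classes that still lack one.

```python
# Minimal sketch (not the CSNePS-based tool from the paper): list the named
# classes of an ontology that have no textual definition, using the IAO
# "definition" annotation property (IAO_0000115) common in OBO ontologies.
from rdflib import Graph, URIRef
from rdflib.namespace import OWL, RDF

DEFINITION = URIRef("http://purl.obolibrary.org/obo/IAO_0000115")

def undefined_terms(ontology_file):
    g = Graph()
    g.parse(ontology_file)  # rdflib guesses the serialization from the extension
    named_classes = {c for c in g.subjects(RDF.type, OWL.Class) if isinstance(c, URIRef)}
    return [c for c in named_classes if (c, DEFINITION, None) not in g]

if __name__ == "__main__":
    missing = undefined_terms("my_ontology.owl")  # hypothetical file name
    print(f"{len(missing)} classes still need a textual definition")
```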

Jerónimos Monastery

In the afternoon, I attended the tutorial on the Biological Pathway Exchange (BioPAX) ontology and the Pathway Commons database. BioPAX is a standard language that aims to enable the integration, exchange, visualization and analysis of biological pathway data. The standard is defined in OWL DL and is represented in the RDF/XML format; specifically, BioPAX supports data exchange between pathway data groups. ChiBE and Paxtools are tools designed for accessing, editing and visualizing data in BioPAX format.
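Because BioPAX documents are plain RDF/XML, even a generic RDF library can inspect them. The sketch below is not ChiBE or Paxtools (which are Java tools); it is a minimal Python example that lists the pathways in an assumed local file named pathway.owl, using the BioPAX Level 3 namespace.

```python
# Minimal sketch: list the pathways (and their display names) in a BioPAX
# Level 3 file. BioPAX is serialized as RDF/XML, so rdflib can read it directly.
from rdflib import Graph, Namespace
from rdflib.namespace import RDF

BP3 = Namespace("http://www.biopax.org/release/biopax-level3.owl#")

g = Graph()
g.parse("pathway.owl", format="xml")  # assumed local BioPAX Level 3 document

for pathway in g.subjects(RDF.type, BP3.Pathway):
    # displayName is the human-readable label used by BioPAX Level 3
    for name in g.objects(pathway, BP3.displayName):
        print(pathway, "->", name)
```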

Rua Augusta Arch

Day 2

I attended the tutorial organized by Barry Smith about the Basic Formal Ontology (BFO). This was a useful opportunity to understand the basic principles of this upper ontology. The tutorial focused on the most recent version, BFO 2.0, which proposes a classification of material entities; specifically, the category “object”. The video introduction to BFO 2.0 is available here.

In the afternoon, James A. Overton gave an interesting tutorial about OBO: Best Practices for Ontology Use. The OBO Foundry includes a set of principles for ontology development with the goal of creating interoperable ontologies in the biomedical domain. Some of these good practices recommend that ontologies must be open; available, preferably, in OWL; and that each ontology term must have a unique URI identifier and a textual definition. The tutorial presented nice tools like OntoMaton, which facilitates finding and reusing ontology terms from BioPortal (see the sketch below), as well as OntoFox and Ontorat. OntoFox and Ontorat are complementary: OntoFox supports the reuse of existing ontology terms, while Ontorat supports the automatic generation of new ontology terms, axioms and annotations. It would be great to integrate the functionalities of both platforms in one; developers could save a lot of time!
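The kind of term lookup OntoMaton runs from a spreadsheet can also be done directly against BioPortal’s REST API. The sketch below is only an illustration under assumptions: the endpoint, the apikey parameter and the response fields (collection, prefLabel, @id) reflect my reading of the public BioPortal API, and a free API key from bioontology.org is required.

```python
# Minimal sketch of a BioPortal term search (what OntoMaton automates from a
# spreadsheet). Endpoint and response fields are assumptions based on the
# public BioPortal REST API; an API key from bioontology.org is required.
import requests

API_KEY = "YOUR_BIOPORTAL_API_KEY"  # placeholder

def search_bioportal(query):
    resp = requests.get(
        "https://data.bioontology.org/search",
        params={"q": query, "apikey": API_KEY},
        timeout=30,
    )
    resp.raise_for_status()
    for hit in resp.json().get("collection", []):
        print(hit.get("prefLabel"), "-", hit.get("@id"))

search_bioportal("experimental protocol")
```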

Padrão dos Descobrimentos

Keynotes and paper sessions

Day 3

The third day was packed with presentations, flash talks and poster & demo sessions. Helen Parkinson gave the first keynote. Helen showed data resources and tools spanning from DNA and protein sequences to complex pathways and networks (some examples: CTTV, IMPC, IMPReSS and ZOOMA). The data stored in these repositories are described in a machine-readable form and are queryable in a standard way that allows access to pertinent information, including phenotypic relationships and distinctions. To enable these functionalities they use ontologies (core vocabularies) as infrastructure to provide sets of common terms and defined synonyms, and to specify relationships amongst terms. With examples, Helen showed us how computer science has contributed to the development of computational tools for designing complex vocabularies that facilitate information retrieval and data linking.

In the paper sessions, a presentation about the Information Artifact Ontology (IAO) got my attention… I am reusing this ontology to represent an experimental protocol as a document (you can see more here). IAO was created for the representation of types of information content entities (ICEs) such as documents, databases and digital images. ICEs are marked by the feature of aboutness: a name, an address or a data entry is about some phenomenon (a note that says “ICBO 2015 takes place on 27-30 July 2015 in Lisbon, Portugal” is about the conference). However, the ICEs concretized in the different types of scholarly communication are only partially addressed in IAO, and Barry Smith presented how they are tackling that problem (the presentation is available here).
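To make the aboutness idea concrete, here is a minimal sketch of how a note like the one above could be described with IAO terms: the note is typed as an information content entity (IAO_0000030) and linked to what it is about via the “is about” relation (IAO_0000136). The example URIs under example.org are made up for illustration.

```python
# Minimal sketch of IAO's "aboutness" pattern with rdflib: a note is typed as
# an information content entity (IAO_0000030) and linked to what it is about
# through the "is about" relation (IAO_0000136). Example URIs are illustrative.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

OBO = Namespace("http://purl.obolibrary.org/obo/")
EX = Namespace("http://example.org/")

g = Graph()
note = EX["icbo2015-note"]
g.add((note, RDF.type, OBO.IAO_0000030))        # information content entity
g.add((note, RDFS.label,
       Literal("ICBO 2015 takes place on 27-30 July 2015 in Lisbon, Portugal")))
g.add((note, OBO.IAO_0000136, EX["ICBO2015"]))  # is about -> the conference

print(g.serialize(format="turtle"))
```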

Day 4

Chris J. O. Baker presented BIM, an open ontology for the annotation of biomedical images. BIM specifically addresses provenance and crowd-annotation issues during biomedical image manipulation.

Catarina Martins, who won the best early career paper award, presented a preliminary visualization tool that supports the identification of mapping incoherence by displaying the sets of mappings involved in logical conflicts between several pairs of BioPortal ontologies, as well as the classes and axioms involved.

I am finishing this post by mentioning another great piece of research: the work presented by Pablo López-García (winner of the best paper award). He and his colleague Stefan Schulz studied SNOMED CT and proposed an algorithm for extracting modules that preserve the shape of SNOMED CT; they call them ‘balanced modules’. Their algorithm, although still at an initial stage, shows that SNOMED CT can be ‘squeezed’ without losing its shape.

In summary, four awesome days from both an ontological and a personal perspective. Ontologies are both a productive part of the current bioinformatics environment and an integral component of the future of bioinformatics.

Help us to Improve the Reproducibility of a Laboratory Protocol

Protocols are at the heart of experimental research. Protocols describe the experimental workflows to be followed in laboratories. A well-constructed protocol ensures a common understanding of the study: how to carry out each procedure and the quantities of everything involved in the execution of the experimental workflow; the details are in the protocol. Protocols are also a specific type of scientific publication, one that is widely used and shared in laboratory practice and is a structural part of laboratory notebooks. Experimental protocols are decisive in permitting the reproducibility and successful replication of experiments.

Normally, the detailed notes about experimental procedures, the order in which they are executed, their implementation, and the types of materials and the variety of methods used by a researcher are available only to those inside the research group where the experiment is being carried out. When a laboratory protocol is published, the description of the process is often insufficient. For instance, the instruction “Centrifuge for 10 min” does not include the centrifugation speed; ambiguity and imprecision are also common, as in “Incubate at -20°C overnight”. But how long is overnight? Natural language is excellent for human communication but a poor choice for machine processing.
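To illustrate the contrast with machine processing, here is a minimal, hypothetical sketch of what an explicitly parameterized protocol step could look like. The field names and example values are mine, not part of any published standard; the point is only that a centrifugation step without a speed, or an incubation without a defined duration, is immediately visible as an incomplete record.

```python
# Hypothetical sketch of a machine-processable protocol step: every parameter
# a machine (or another researcher) needs is stated explicitly. Field names
# and example values are illustrative only.
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProtocolStep:
    action: str
    duration_min: float
    speed_rpm: Optional[float] = None      # needed for centrifugation steps
    temperature_c: Optional[float] = None  # needed for incubation steps

steps = [
    ProtocolStep(action="centrifuge", duration_min=10, speed_rpm=13000),
    ProtocolStep(action="incubate", duration_min=16 * 60, temperature_c=-20),
]

for step in steps:
    print(step)
```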

We are proposing a Minimum Information (MI) standard, or set of guidelines, for reporting a laboratory protocol: miProtocol. miProtocol comes from analyzing 100 laboratory protocols published in 9 repositories, namely: BioTechniques, Cold Spring Harbor Protocols, Current Protocols, Genetics and Molecular Research, JoVE, Protocol Exchange, Plant Methods, PLOS ONE and Springer Protocols.

Help us to improve our understanding of laboratory protocols. Here is a questionnaire that will make it easier for us to explore this space of information.

About my Experience at Beyond the PDF 2

I had an amazing time at Beyond the PDF 2 in Amsterdam (March 19-20), thanks to the travel fellowship sponsored by Elsevier. This event was a great opportunity to meet people with diverse backgrounds (scholars, technologists, policy experts, librarians, start-ups, publishers, …), all interested in making scholarly and research communication better. Let me show you some personal highlights from this conference.

Keynotes

During these two days, we had two great keynotes. One of them was by Kathleen Fitzpatrick from the Modern Language Association. She discussed how essential it is for the humanities to embrace new forms of scholarly communication, for instance blogs, which provide a faster alternative for disseminating information than more traditional channels such as books. In my opinion, the blog is a winner in accessibility, but the book still has better quality control.

The other keynote, by Carol Tenopir (Chancellor’s Professor at the School of Information Sciences at the University of Tennessee, Knoxville), discussed the reading practices of academics. She has done in-depth tracking of how scientists read. Scholars read more now than ever, and scholarly material remains essential for research and teaching, but the way they find information and read is changing. She also pointed out that e-books have been gaining a lot of popularity.

Demos and Posters

I found a lot of interesting tools to improve scholarly communication. One of them was StemBook, presented by Lisa Girard: a portal of topics related to stem cell biology and protocols that allows the scientific community to keep their findings up to date. I also found it very interesting to see Domeo, a tool developed by Paolo Ciccarese that supports the annotation of scientific documents. Aleix Garrido from Intelligent Software Components (iSOCO) showed a prototype that indicates the reliability of scientific workflows by measuring and monitoring the completeness and stability of this information over time. Finally, Daniel Garijo from the Ontology Engineering Group talked about how to quantify the reproducibility of the Tuberculosis Drugome workflow. I hope all these ideas surpass the prototype stage and soon become real for everyone to use.

Making it happen

This session was scheduled for the second day. Asunción Gómez Pérez from the Ontology Engineering Group talked about the SEALS evaluation platform, which allows the different tests of an experiment to be reproduced automatically. Also during this session, the organizers gave us a challenge… What would you do with 1k today that would change scholarly communication for the better? And of course, I am participating in this challenge. I want to stimulate the analysis of 100 pre-selected published papers (centered on “Materials and Methods”) for reproducibility purposes. This idea is related to my research, presented in a flash talk during the vision-of-the-future session.

People making connections

It was great to see publishers, technologists, researchers and librarians talking to each other. An interesting outcome is the http://scholrev.org/ working group, which aims at challenging and changing the status quo. Although it is still early days… it is refreshing to see people willing to push for change not from the mainstream but from the grassroots; these are usually the movements that deliver deep, significant changes. There is also the CITAGORA project (sorry, but I did not get a URL for it 🙂). It seems they are trying to extract meaningful information from scientific PDFs: they enrich the bibliographic reference by jailbreaking the corresponding PDF and, once liberated, extracting specifics from the content. The result of this extraction is modeled as RDF and placed around the initial bibliographic reference; in this way the resulting dataset is enriched.

New to blogging

Hi everybody, this is my most recent activity. I find it hard to picture myself doing this, but here I am, blogging. I have had this account since 2012, but it is only now that I am planning to really use it. Hopefully I will have interesting things to say. I would like to blog about papers I am reading, places I am visiting, conferences and workshops I am attending and, most importantly, people I get to know.