Comments on GO schema #85
Mostly looks great! Feel free to close if these comments are not helpful or are in the wrong place.
generalizability
This schema generalizes to most ontologies. I don't know if you intend to replicate it for ENVO etc., but it may be worth consolidating into a general ontology schema.
alignment to obograph json model
You may be interested in https://github.com/geneontology/obographs - this json serialization is being adopted by more ontologies. It is a more convenient substrate for parsing (no ad hoc syntactic parsing). You may also want to consult it as the reference model. It is well suited to graph databases.
The philosophy behind the obographs model plus thoughts on obo, owl, and json here: https://douroucouli.wordpress.com/2016/10/04/a-developer-friendly-json-exchange-format-for-ontologies/
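To give a feel for why this is convenient to parse, here is a minimal sketch that reads an obographs-style document with nothing but the standard `json` module. The embedded document is a hand-written fragment following the `graphs` / `nodes` / `edges` layout, not actual GO release output, and `load_graph` is a hypothetical helper name.

```python
import json

# Hand-written fragment in the obographs layout (illustrative only).
DOC = """
{
  "graphs": [
    {
      "id": "http://purl.obolibrary.org/obo/go.owl",
      "nodes": [
        {"id": "http://purl.obolibrary.org/obo/GO_0008150",
         "type": "CLASS", "lbl": "biological_process"},
        {"id": "http://purl.obolibrary.org/obo/GO_0009987",
         "type": "CLASS", "lbl": "cellular process"}
      ],
      "edges": [
        {"sub": "http://purl.obolibrary.org/obo/GO_0009987",
         "pred": "is_a",
         "obj": "http://purl.obolibrary.org/obo/GO_0008150"}
      ]
    }
  ]
}
"""


def load_graph(text):
    """Return (labels_by_id, edge_triples) for the first graph in the document."""
    graph = json.loads(text)["graphs"][0]
    labels = {node["id"]: node.get("lbl") for node in graph.get("nodes", [])}
    edges = [(e["sub"], e["pred"], e["obj"]) for e in graph.get("edges", [])]
    return labels, edges


labels, edges = load_graph(DOC)
for sub, pred, obj in edges:
    print(f"{labels[sub]} --{pred}--> {labels[obj]}")
```

Nodes and edges come out as plain dicts and triples, which map fairly directly onto vertex and edge collections in a graph database.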
synonyms
Synonyms can be thought of as n-ary tuples, or as properties with their own meta-properties: scope (required), type (optional), and provenance/xrefs (optional list). You seem to be flattening them to a string. Is that maybe a consequence of the biopython parser?
This will be awkward if you want to use the synonyms to implement a search index, as the indexer will need to do an initial layer of syntactic parsing.
You can see how we model this in json here:
The same applies to other properties, e.g. text definitions should be attributed to sources, but this may be less important for your use case.
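If you only have the flattened string, the "initial layer of syntactic parsing" mentioned above could look roughly like the sketch below. It assumes the common obo form `"text" SCOPE [TYPE] [xrefs]`; the `Synonym` class and `parse_synonym` function are hypothetical names, and the regex deliberately ignores edge cases such as escaped commas or quoted descriptions inside the xref list.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Synonym:
    text: str
    scope: str                      # EXACT, BROAD, NARROW or RELATED
    type: Optional[str] = None      # optional synonym type ID
    xrefs: List[str] = field(default_factory=list)


# Assumed shape of the value after the "synonym:" tag.
_SYNONYM_RE = re.compile(
    r'^"(?P<text>(?:[^"\\]|\\.)*)"\s+'
    r'(?P<scope>EXACT|BROAD|NARROW|RELATED)'
    r'(?:\s+(?P<type>[^\s\[]+))?'
    r'\s*\[(?P<xrefs>[^\]]*)\]\s*$'
)


def parse_synonym(value: str) -> Synonym:
    """Split a flattened obo synonym value into its parts (text is not unescaped)."""
    match = _SYNONYM_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized synonym value: {value!r}")
    xrefs = [x.strip() for x in match.group("xrefs").split(",") if x.strip()]
    return Synonym(match.group("text"), match.group("scope"),
                   match.group("type"), xrefs)


print(parse_synonym('"cell growth" EXACT []'))
print(parse_synonym('"the other white meat" EXACT MARKETING_SLOGAN [MEAT:00324, BACONBASE:03021]'))
```

Storing the parsed fields in the schema, rather than the raw string, lets a search indexer weight EXACT synonyms differently from RELATED ones without re-parsing.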
separate subschemas for is-a and relationship
This follows the syntax of obo but I would recommend collapsing these and having a generic edge type, as in obographs
(the distinction between is-a and relationship is best understood from an OWL perspective, but this is likely not such a useful level of abstraction for you)
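As a rough sketch of what collapsing could look like, the snippet below turns both `is_a` and `relationship` values into one list of subject/predicate/object edges, using the same `sub` / `pred` / `obj` naming as obographs. The input dict layout and the `EX:` identifiers are made up for illustration and are not taken from your schema.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    sub: str
    pred: str
    obj: str


def term_edges(term: dict) -> List[Edge]:
    """Collapse is_a and relationship values of one term into generic edges."""
    term_id = term["id"]
    # is_a becomes just another predicate.
    edges = [Edge(term_id, "is_a", parent) for parent in term.get("is_a", [])]
    # Assumes each relationship value is "<relation id> <target id>".
    for value in term.get("relationship", []):
        pred, obj = value.split()[:2]
        edges.append(Edge(term_id, pred, obj))
    return edges


# Hypothetical term with made-up identifiers.
term = {
    "id": "EX:0000003",
    "is_a": ["EX:0000001"],
    "relationship": ["part_of EX:0000002"],
}

for edge in term_edges(term):
    print(edge)
```

With a single edge type, a traversal query only needs to filter on `pred` instead of joining across two separate collections.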
intersection_of
It's likely you don't need this. If you do have a need for logical definitions, you will likely want to use the json or the owl, and load go-plus.
Mungall, C.J., Bada, M., Berardini, T.Z., Deegan, J., Ireland, A., Harris, M.A., Hill, D.P., and Lomax, J. (2011). Cross-product extensions of the Gene Ontology. Journal of Biomedical Informatics 44, 80–86. https://doi.org/10.1016/j.jbi.2010.02.002
Mungall, C.J., Dietze, H., and Osumi-Sutherland, D. (2014). Use of OWL within the Gene Ontology. In Proceedings of the 11th International Workshop on OWL: Experiences and Directions (OWLED 2014), M. Keet, and V. Tamma, eds. (Riva del Garda, Italy, October 17-18, 2014), pp. 25–36. https://doi.org/10.1101/010090
disjointness axioms
As above, the OWLED paper gives an idea of how they are used: mostly during ontology development and annotation validation/QC. They are less useful for data analysis/querying.
Relationship types
obo format has a stanza type called Typedef (which is a terrible name). This should have been called 'relationship type'. I think it may be useful for you to explicitly model this. Relationship types can have properties that may be useful to you.
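A minimal sketch of modeling this explicitly, assuming you start from the lines of a `[Typedef]` stanza. `RelationshipType` and `parse_typedef` are hypothetical names, only a few of the available tags are captured, and repeated tags simply overwrite each other here.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RelationshipType:
    id: str
    name: Optional[str] = None
    is_transitive: bool = False
    xref: Optional[str] = None


def parse_typedef(lines: Iterable[str]) -> RelationshipType:
    """Build a RelationshipType from the tag-value lines of one Typedef stanza."""
    fields = {}
    for line in lines:
        line = line.split("!", 1)[0].strip()   # drop trailing obo comments
        if not line or line.startswith("["):    # skip blanks and the stanza header
            continue
        tag, _, value = line.partition(":")
        fields[tag.strip()] = value.strip()     # last occurrence of a tag wins
    return RelationshipType(
        id=fields["id"],
        name=fields.get("name"),
        is_transitive=fields.get("is_transitive") == "true",
        xref=fields.get("xref"),
    )


# Illustrative stanza.
STANZA = """\
[Typedef]
id: part_of
name: part of
is_transitive: true
xref: BFO:0000050
"""

rel = parse_typedef(STANZA.splitlines())
print(rel)
```

A property such as `is_transitive` is the kind of thing that can matter at query time, for example when deciding whether it is valid to follow a chain of edges with the same relation.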
Most OBO ontologies (including GO) draw their relationship types from RO:
http://obofoundry.org/ontology/ro
The documentation on the RO wiki may be useful for thinking about how relations can be used in your RE
The (incomplete) NCATS Translator Ontology Primer I worked on with Marcin may also be useful:
https://docs.google.com/document/d/1faKOTMOrH4vn8NHx9s8WzdDujOkKu0RwaYZPrdtmnoI/edit